[linux-lvm] How well tested is the snapshot feature?


* [linux-lvm] How well tested is the snapshot feature?
@ 2002-06-07  4:27 Stephan Austermuehle
  2002-06-07  4:33 ` Joe Thornber
  0 siblings, 1 reply; 14+ messages in thread
From: Stephan Austermuehle @ 2002-06-07  4:27 UTC
  To: linux-lvm

Hi,

are there any known issues with the snapshot feature in LVM 1.0.4?
Using it for backups of XFS causes trouble and leads to kernel crashes
regularly. For the moment I can give hardly any details because XFS
filled the log file with zeros after crashing. Sometimes the kernel
oopses when running xfsdump, sometimes when I do lvremove on the
snapshot device.

Stephan


* Re: [linux-lvm] How well tested is the snapshot feature?
  2002-06-07  4:27 [linux-lvm] How well tested is the snapshot feature? Stephan Austermuehle
@ 2002-06-07  4:33 ` Joe Thornber
  2002-06-07  4:37   ` Stephan Austermuehle
  0 siblings, 1 reply; 14+ messages in thread
From: Joe Thornber @ 2002-06-07  4:33 UTC
  To: linux-lvm

On Fri, Jun 07, 2002 at 11:27:10AM +0200, Stephan Austermuehle wrote:
> Hi,
> 
> are there any known issues with the snapshot feature in LVM 1.0.4?
> Using it for backups of XFS causes trouble and leads to kernel crashes
> regularly. For the moment I can give hardly any details because XFS
> filled the log file with zeros after crashing. Sometimes the kernel
> oopses when running xfsdump, sometimes when I do lvremove on the
> snapshot device.

Did you apply the VFS patch from the 1.0.3 release?  This makes sure
that the filesystem flushes a consistent state to the disk before the
snapshot is set up.

- Joe


* Re: [linux-lvm] How well tested is the snapshot feature?
  2002-06-07  4:33 ` Joe Thornber
@ 2002-06-07  4:37   ` Stephan Austermuehle
  2002-06-07  4:45     ` Joe Thornber
  0 siblings, 1 reply; 14+ messages in thread
From: Stephan Austermuehle @ 2002-06-07  4:37 UTC
  To: linux-lvm

On Fri, Jun 07, 2002 at 10:33:00AM +0100, Joe Thornber wrote:

> Did you apply the VFS patch from the 1.0.3 release?  This makes
> sure that the filesystem flushes a consistent state to the disk
> before the snapshot is set up.

Do you mean the VFS lock patch? Yes, it is applied.

Stephan


* Re: [linux-lvm] How well tested is the snapshot feature?
  2002-06-07  4:37   ` Stephan Austermuehle
@ 2002-06-07  4:45     ` Joe Thornber
  2002-06-07  4:56       ` Stephan Austermuehle
  2002-06-07 11:27       ` Adrian Head
  0 siblings, 2 replies; 14+ messages in thread
From: Joe Thornber @ 2002-06-07  4:45 UTC
  To: linux-lvm

On Fri, Jun 07, 2002 at 11:37:05AM +0200, Stephan Austermuehle wrote:
> On Fri, Jun 07, 2002 at 10:33:00AM +0100, Joe Thornber wrote:
> 
> > Did you apply the VFS patch from the 1.0.3 release?  This makes
> > sure that the filesystem flushes a consistent state to the disk
> > before the snapshot is set up.
> 
> Do you mean the VFS lock patch? Yes, it is applied.

Curious, as far as I know people are using LVM1 snapshots successfully
with xfs.

- Joe


* Re: [linux-lvm] How well tested is the snapshot feature?
  2002-06-07  4:45     ` Joe Thornber
@ 2002-06-07  4:56       ` Stephan Austermuehle
  2002-06-07  5:08         ` Joe Thornber
  2002-06-07 11:27       ` Adrian Head
  1 sibling, 1 reply; 14+ messages in thread
From: Stephan Austermuehle @ 2002-06-07  4:56 UTC
  To: linux-lvm

On Fri, Jun 07, 2002 at 10:44:30AM +0100, Joe Thornber wrote:

> Curious, as far as I know people are using LVM1 snapshots
> successfully with xfs.

To be precise: I used the VFS lock patch from LVM 1.1-rc2 with
LVM 1.0.4 because it applies without rejects. But I had the same
problem with previous LVM versions and their own VFS lock patches, too.

At the moment I have two systems that cause trouble: one SuSE 7.0-based
system that has been running for about 2.5 years, and one freshly
installed Debian Woody system. They differ in hardware, compiler, libc,
etc., so it seems to be kernel/LVM/XFS related.

Stephan


* Re: [linux-lvm] How well tested is the snapshot feature?
  2002-06-07  4:56       ` Stephan Austermuehle
@ 2002-06-07  5:08         ` Joe Thornber
  2002-06-07  5:11           ` Patrick Caulfield
  0 siblings, 1 reply; 14+ messages in thread
From: Joe Thornber @ 2002-06-07  5:08 UTC
  To: linux-lvm

On Fri, Jun 07, 2002 at 11:56:23AM +0200, Stephan Austermuehle wrote:
> On Fri, Jun 07, 2002 at 10:44:30AM +0100, Joe Thornber wrote:
> 
> > Curious, as far as I know people are using LVM1 snapshots
> > successfully with xfs.
> 
> To be precise: I used the VFS lock patch from LVM 1.1-rc2 with
> LVM 1.0.4 because it applies without rejects. But I had the same
> problem with previous LVM versions and their own VFS lock patches, too.
> 
> At the moment I have two systems that cause trouble: one SuSE 7.0-based
> system that has been running for about 2.5 years, and one freshly
> installed Debian Woody system. They differ in hardware, compiler, libc,
> etc., so it seems to be kernel/LVM/XFS related.

It wouldn't surprise me if there were problems with the 1.1rc2 kernel
patch since snapshots have been changed in that release ... but you
are using 1.0.4.

- Joe


* Re: [linux-lvm] How well tested is the snapshot feature?
  2002-06-07  5:08         ` Joe Thornber
@ 2002-06-07  5:11           ` Patrick Caulfield
  2002-06-07  5:16             ` Stephan Austermuehle
  0 siblings, 1 reply; 14+ messages in thread
From: Patrick Caulfield @ 2002-06-07  5:11 UTC
  To: linux-lvm

On Fri, Jun 07, 2002 at 11:07:06AM +0100, Joe Thornber wrote:
> On Fri, Jun 07, 2002 at 11:56:23AM +0200, Stephan Austermuehle wrote:
> > On Fri, Jun 07, 2002 at 10:44:30AM +0100, Joe Thornber wrote:
> > 
> > > Curious, as far as I know people are using LVM1 snapshots
> > > successfully with xfs.
> > 
> > To be precise: I used the VFS lock patch from LVM 1.1-rc2 with
> > LVM 1.0.4 because it applies without rejects. But I had the same
> > problem with previous LVM versions and their own VFS lock patches, too.
> > 
> > At the moment I have two systems that cause trouble: one SuSE 7.0-based
> > system that has been running for about 2.5 years, and one freshly
> > installed Debian Woody system. They differ in hardware, compiler, libc,
> > etc., so it seems to be kernel/LVM/XFS related.
> 
> It wouldn't surprise me if there were problems with the 1.1rc2 kernel
> patch since snapshots have been changed in that release ... but you
> are using 1.0.4.
 
I think he means the VFS lock patch in 1.1rc2 - which should be fine.
It might be a stack size issue ?

patrick


* Re: [linux-lvm] How well tested is the snapshot feature?
  2002-06-07  5:11           ` Patrick Caulfield
@ 2002-06-07  5:16             ` Stephan Austermuehle
  0 siblings, 0 replies; 14+ messages in thread
From: Stephan Austermuehle @ 2002-06-07  5:16 UTC
  To: linux-lvm

On Fri, Jun 07, 2002 at 11:10:52AM +0100, Patrick Caulfield wrote:

> I think he means the VFS lock patch in 1.1rc2 - which should be
> fine.  It might be a stack size issue ?

I took only the VFS lock patch from 1.1rc2, everything else is from
1.0.4.

Stephan


* Re: [linux-lvm] How well tested is the snapshot feature?
  2002-06-07  4:45     ` Joe Thornber
  2002-06-07  4:56       ` Stephan Austermuehle
@ 2002-06-07 11:27       ` Adrian Head
  1 sibling, 0 replies; 14+ messages in thread
From: Adrian Head @ 2002-06-07 11:27 UTC
  To: linux-lvm

On Fri, 7 Jun 2002 19:44, Joe Thornber wrote:
> Curious, as far as I know people are using LVM1 snapshots successfully
> with xfs.

Yes, there are some of us who are successfully using LVM snapshots & XFS.

There are a couple of interesting threads on the linux-xfs mailing list at the
moment that may interest you.  One of them can be found here:
http://marc.theaimsgroup.com/?l=linux-xfs&m=102337866925770&w=2

-- 
Adrian Head

(Public Key available on request.)


* RE: [linux-lvm] How well tested is the snapshot feature?
@ 2002-06-07 10:30 Dale Stephenson
  2002-06-07 12:48 ` Joe Thornber
  0 siblings, 1 reply; 14+ messages in thread
From: Dale Stephenson @ 2002-06-07 10:30 UTC
  To: 'linux-lvm@sistina.com'; +Cc: 'linux-xfs@oss.sgi.com'

[-- Attachment #1: Type: text/plain, Size: 3010 bytes --]

I'm one of those people using LVM1 snapshots mostly successfully with XFS.
However:

1) I'm using a process flag hack to keep XFS's filesystem freeze from
snagging writes by kupdated, xfs_freeze, and fsync_dev/fsync_dev_lockfs.
I've also removed the xfs_unmountfs_writesb() call from xfs_fs_thaw().  I've
included the two patches I use, which apply against XFS 1.1.  I've directed
most of my discussion of this to the XFS list, as the locking mechanism is
filesystem-specific.  The only filesystem calls that the lvm driver makes
with the VFS patch are fsync_dev_lockfs() and unlockfs().  The COW activity
itself is done through brw_kiovec(), which I believe should not go through
the file system at all (though it seems to cause lots of indirect file
system activity through kupdated).

device-mapper (LVM2) uses (with VFS enhancement) the very same
fsync_dev_lockfs() and unlockfs() calls.  However, the COW activity is not
handled through brw_kiovec(), instead being transferred to device-mapper's
kcopyd.  I haven't worked with LVM2 yet, so it's certainly possible that
kcopyd alleviates the pressure on kupdated.  But in theory I would expect
it to be susceptible to the same file system deadlocks experienced by LVM1.


2) I'm still seeing an occasional xfs_freeze deadlock.
xfs_unmountfs_writesb() (from xfs_freeze) and kupdated get stuck on separate
pagebuf locks.  It occurs with multiple snapshots and streaming writes to
the snapshot source over both samba and nfs.

Using these patches, I haven't seen any oops problems, only the odd
deadlock.  You also have to mount the XFS snapshots with nouuid and
norecovery options.
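
For readers unfamiliar with the idiom, both attached patches repeat the same
pattern: remember the caller's flags, set PF_NO_FREEZE around the flush, and
clear the bit afterwards only if it was not already set on entry, so nested
callers do not unset each other's flag.  What follows is a minimal user-space
sketch of that pattern, not kernel code.  It assumes a plain variable
(task_flags) standing in for current->flags; with_no_freeze(),
flag_seen_by_flush() and nested_flush() are hypothetical names used only for
illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Same bit value as the PF_NO_FREEZE define in the attached patch. */
#define PF_NO_FREEZE 0x01000000UL

/* Assumption: stand-in for current->flags, which lives in task_struct. */
unsigned long task_flags;

/* Hypothetical callback type for "some flush that must not freeze". */
typedef int (*flush_fn)(void);

/*
 * Run fn() with PF_NO_FREEZE set.  The bit is cleared afterwards only
 * when this call was the one that set it, so an outer caller that
 * already holds the flag keeps it.
 */
int with_no_freeze(flush_fn fn)
{
    unsigned long saved = task_flags;
    int ret;

    assert(fn != NULL);
    task_flags |= PF_NO_FREEZE;
    ret = fn();
    if (!(saved & PF_NO_FREEZE))
        task_flags &= ~PF_NO_FREEZE;
    return ret;
}

/* Hypothetical flush: reports whether the flag is visible to it. */
int flag_seen_by_flush(void)
{
    return (task_flags & PF_NO_FREEZE) != 0;
}

/* Nested use: the inner call must leave the outer caller's flag set. */
int nested_flush(void)
{
    int inner = with_no_freeze(flag_seen_by_flush);

    return inner && flag_seen_by_flush();
}
```

Calling with_no_freeze(nested_flush) mirrors the situation in the patches
where fsync_dev_lockfs() sets the flag and then reaches code that sets it
again: the inner exit leaves the bit alone and the outer exit clears it.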

Another approach that will work is to forget using the VFS patch entirely,
and use writeable snapshots (available for 1.1, LVM2, and working patches
have been posted to the list for 1.0.x).  If the snapshot is writeable you
can just mount with nouuid (or change the UUID of the snapshot), mount the
snapshot and let XFS recovery do its stuff.  This has the advantage of not
running into any locking problems whatsoever.  The disadvantage is that the
filesystem isn't clean, so instead of having a snapshot of the filesystem at
that point of time, you have the filesystem as if the power had suddenly
failed at that point in time.

Dale Stephenson
steph@snapserver.com

> -----Original Message-----
> From: Joe Thornber [mailto:joe@fib011235813.fsnet.co.uk]
> Sent: Friday, June 07, 2002 2:45 AM
> To: linux-lvm@sistina.com
> Subject: Re: [linux-lvm] How well tested is the snapshot feature?
> 
> 
> On Fri, Jun 07, 2002 at 11:37:05AM +0200, Stephan Austermuehle wrote:
> > On Fri, Jun 07, 2002 at 10:33:00AM +0100, Joe Thornber wrote:
> > 
> > > Did you apply the VFS patch from the 1.0.3 release?  This makes
> > > sure that the filesystem flushes a consistent state to the disk
> > > before the snapshot is set up.
> > 
> > Do you mean the VFS lock patch? Yes, it is applied.
> 
> Curious, as far as I know people are using LVM1 snapshots successfully
> with xfs.
> 
> - Joe


[-- Attachment #2: no_freeze.patch --]
[-- Type: application/octet-stream, Size: 3274 bytes --]

--- linux.old/fs/xfs/xfs_log.c	Wed Mar 20 04:51:25 2002
+++ linux/fs/xfs/xfs_log.c	Sat Mar 30 00:38:58 2002
@@ -324,6 +324,11 @@
 	int	rval;
 	xlog_t *log = mp->m_log;
 
+	if (XFS_MTOVFS(mp)->vfs_flag & VFS_RDONLY) {
+		printk("ignoring xfs_log_force on a read-only filesystem\n");
+		return 0;
+	}
+
 #if defined(DEBUG) || defined(XLOG_NOLOG)
 	if (! xlog_debug && xlog_devt == log->l_dev)
 		return 0;
--- linux.old/fs/xfs/xfs_mount.c	Sat Mar  9 03:21:26 2002
+++ linux/fs/xfs/xfs_mount.c	Sat Mar 30 00:39:39 2002
@@ -1680,6 +1680,7 @@
 	int		level)
 {
 	int	s = mutex_spinlock(&mp->m_freeze_lock);
+	unsigned long flags;
 
 	mp->m_frozen = level;
 	mutex_spinunlock(&mp->m_freeze_lock, s);
@@ -1688,13 +1689,27 @@
 		while (atomic_read(&mp->m_active_trans) > 0)
 			delay(100);
 	}
+
+	flags = current->flags;
+	current->flags |= PF_NO_FREEZE;
+
+	/* make sure the log is written after we freeze */
+	xfs_log_force(mp, 0, XFS_LOG_FORCE|XFS_LOG_SYNC);
+
+	if (! (flags & PF_NO_FREEZE)) {
+		current->flags &= ~PF_NO_FREEZE;
+	}
 }
 
 void
 xfs_finish_freeze(
 	xfs_mount_t *mp)
 {
-	int	s = mutex_spinlock(&mp->m_freeze_lock);
+	int	s;
+
+	if (current->flags & PF_NO_FREEZE) return;
+
+	s = mutex_spinlock(&mp->m_freeze_lock);
 
 	if (mp->m_frozen) {
 		mp->m_frozen = 0;
@@ -1713,6 +1728,14 @@
 {
 	int	s;
 	int	do_lock = 0;
+
+	/* some processes must not freeze - eg. a lvcreate or kupdated, otherwise
+	   lvcreate locks solid as it tries to flush blocks, and that gets here */
+	if (current->flags & PF_NO_FREEZE)  {
+		if (level == XFS_FREEZE_TRANS)
+			atomic_inc(&mp->m_active_trans);
+		return;
+	}
 
 	if (!mp->m_frozen) {
 		if (level == XFS_FREEZE_TRANS)
--- linux.old/fs/buffer.c	Sat Mar 30 00:28:28 2002
+++ linux/fs/buffer.c	Sat Mar 30 00:40:54 2002
@@ -392,6 +392,14 @@
 
 int fsync_dev(kdev_t dev)
 {
+	int ret;
+	unsigned long flags;
+
+	flags = current->flags;
+	/* we set this flag to prevent the XFS pagebuf code causing a deadlock 
+	   during a sync - a frozen filesystem should freeze only new IO, not
+	   existing data waiting to be flushed */
+	current->flags |= PF_NO_FREEZE;
 	sync_buffers(dev, 0);
 
 	lock_kernel();
@@ -400,7 +408,13 @@
 	sync_supers(dev);
 	unlock_kernel();
 
-	return sync_buffers(dev, 1);
+	ret = sync_buffers(dev, 1);
+
+	if (! (flags & PF_NO_FREEZE)) {
+		current->flags &= ~PF_NO_FREEZE;
+	}
+
+	return ret;
 }
 
 /*
@@ -3038,6 +3052,9 @@
 	siginitsetinv(&current->blocked, sigmask(SIGCONT) | sigmask(SIGSTOP));
 	recalc_sigpending(tsk);
 	spin_unlock_irq(&tsk->sigmask_lock);
+
+	/* kupdated can also do IO of old blocks on a frozen filesystem */
+	current->flags |= PF_NO_FREEZE;
 
 	complete((struct completion *)startup);
 
--- linux.old/include/linux/sched.h	Sat Mar 30 00:30:23 2002
+++ linux/include/linux/sched.h	Fri Mar 29 23:34:13 2002
@@ -428,6 +428,7 @@
 #define PF_FREE_PAGES	0x00002000	/* per process page freeing */
 #define PF_NOIO		0x00004000	/* avoid generating further I/O */
 #define PF_FSTRANS	0x00008000	/* inside a filesystem transaction */
+#define PF_NO_FREEZE	0x01000000	/* ignore fs freeze flag */
 
 #define PF_USEDFPU	0x00100000	/* task used FPU this quantum (SMP) */
 

[-- Attachment #3: no_freeze_lockfs.patch --]
[-- Type: application/octet-stream, Size: 2173 bytes --]

--- linux/fs/buffer.c.old	Tue Apr  2 10:47:30 2002
+++ linux/fs/buffer.c	Tue Apr  2 10:48:35 2002
@@ -428,6 +428,9 @@
 
 int fsync_dev_lockfs(kdev_t dev)
 {
+	unsigned long flags;
+	int ret;
+
 	/* you are not allowed to try locking all the filesystems
 	** on the system, your chances of getting through without
 	** total deadlock are slim to none.
@@ -435,6 +438,9 @@
 	if (!dev)
 		return fsync_dev(dev) ;
 
+	flags = current->flags;
+	current->flags |= PF_NO_FREEZE;
+
 	sync_buffers(dev, 0);
 
 	lock_kernel();
@@ -451,7 +457,13 @@
 	sync_supers_lockfs(dev) ;
 	unlock_kernel();
 
-	return sync_buffers(dev, 1) ;
+	ret = sync_buffers(dev, 1) ;
+
+	if (! (flags & PF_NO_FREEZE)) {
+		current->flags &= ~PF_NO_FREEZE;
+	}
+
+	return ret;
 }
 
 asmlinkage long sys_sync(void)
@@ -1171,13 +1183,23 @@
 void balance_dirty(void)
 {
 	int state = balance_dirty_state();
+	unsigned long flags;
 
 	if (state < 0)
 		return;
 
 	/* If we're getting into imbalance, start write-out */
 	spin_lock(&lru_list_lock);
+
+	flags = current->flags;
+	current->flags |= PF_NO_FREEZE;
+
 	write_some_buffers(NODEV);
+
+	if (! (flags & PF_NO_FREEZE)) {
+		current->flags &= ~PF_NO_FREEZE;
+	}
+
 
 	/*
 	 * And if we're _really_ out of balance, wait for
--- linux/fs/xfs/xfs_fsops.c.old	Fri Apr 26 18:10:29 2002
+++ linux/fs/xfs/xfs_fsops.c	Fri Apr 26 18:13:16 2002
@@ -551,6 +551,7 @@
 	vfs_t		*vfsp;
 	/*REFERENCED*/
 	int		error;
+	unsigned long flags;
 
 	vfsp = XFS_MTOVFS(mp);
 
@@ -565,6 +566,10 @@
 
 	/* Pause transaction subsystem */
 	xfs_start_freeze(mp, XFS_FREEZE_TRANS);
+	
+	/* don't freeze the freeze */
+	flags = current->flags;
+	current->flags |= PF_NO_FREEZE;
 
 	/* Flush any remaining inodes into buffers */
 	VFS_SYNC(vfsp, SYNC_ATTR|SYNC_WAIT, sys_cred, error);
@@ -579,6 +584,10 @@
 	xfs_log_unmount_write(mp);
 	xfs_unmountfs_writesb(mp);
 
+	if (! (flags & PF_NO_FREEZE)) {
+		current->flags &= ~PF_NO_FREEZE;
+	}
+
 	return 0;
 }
 
@@ -587,7 +596,6 @@
 xfs_fs_thaw(
 	xfs_mount_t	*mp)
 {
-	xfs_unmountfs_writesb(mp);
 	xfs_finish_freeze(mp);
 	return 0;
 }


* Re: [linux-lvm] How well tested is the snapshot feature?
  2002-06-07 10:30 Dale Stephenson
@ 2002-06-07 12:48 ` Joe Thornber
  0 siblings, 0 replies; 14+ messages in thread
From: Joe Thornber @ 2002-06-07 12:48 UTC
  To: linux-lvm; +Cc: 'linux-xfs@oss.sgi.com'

On Fri, Jun 07, 2002 at 08:35:44AM -0700, Dale Stephenson wrote:
> device-mapper (LVM2) uses (with VFS enhancement) the very same
> fsync_dev_lockfs() and unlockfs() calls.  However, the COW activity is not
> handled through brw_kiovec(), instead being transferred to device-mapper's
> kcopyd.  I haven't worked with LVM2 yet, so it's certainly possible that
> kcopyd alleviates the pressure on kupdated.  But in theory I would expect
> it to be susceptible to the same file system deadlocks experienced by LVM1.

I'm not sure what this kupdated interaction that you mention could be.
Both brw_kiovec and kcopyd stay well away from both the filesystem
and the buffer cache.

> 2) I'm still seeing an occasional xfs_freeze deadlock.
> xfs_unmountfs_writesb() (from xfs_freeze) and kupdated get stuck on separate
> pagebuf locks.  It occurs with multiple snapshots and streaming writes to
> the snapshot source over both samba and nfs.

Which kernel are you using ?  I've found that 2.4.18 can be easily
persuaded to deadlock by having two processes making GFP_NOIO requests
for memory whilst the system is short of free memory. 2.4.19-pre9
works fine.

- Joe


* RE: [linux-lvm] How well tested is the snapshot feature?
@ 2002-06-07 13:26 Dale Stephenson
  2002-06-08  4:00 ` Joe Thornber
  0 siblings, 1 reply; 14+ messages in thread
From: Dale Stephenson @ 2002-06-07 13:26 UTC
  To: 'linux-lvm@sistina.com'; +Cc: 'linux-xfs@oss.sgi.com'

Joe Thornber writes:
> On Fri, Jun 07, 2002 at 08:35:44AM -0700, Dale Stephenson wrote:
> > device-mapper (LVM2) uses (with VFS enhancement) the very same
> > fsync_dev_lockfs() and unlockfs() calls.  However, the COW activity is not
> > handled through brw_kiovec(), instead being transferred to device-mapper's
> > kcopyd.  I haven't worked with LVM2 yet, so it's certainly possible that
> > kcopyd alleviates the pressure on kupdated.  But in theory I would expect
> > it to be susceptible to the same file system deadlocks experienced by LVM1.
> 
> I'm not sure what this kupdated interaction that you mention could be.
> Both brw_kiovec and kcopyd stay well away from both the filesystem
> and the buffer cache.
> 
Subjective impression.  kupdated always seems to be in D state with
streaming writes and snapshots, more so than a similar stream directed at
LVM + XFS without snapshots.  While brw_kiovec and kcopyd stay away from the
filesystem, the filesystem doesn't stay away from them!  When kupdated
writes out something to a LV with multiple snapshots, multiple COW can
occur.  Since with device-mapper the COW is done by a separate process
(kcopyd), I'd expect kupdated to not spend so much time in D.  Plus
device-mapper's supposed to be faster. 

But when I say I expect "it" to be susceptible I'm talking about the system,
NOT the COW activity.  I really haven't had a problem with a thread getting
stuck while trying to do COW.  The problem I'm seeing now is with
xfs_unmountfs_writesb() as called from xfs_fs_freeze().  I've only seen the
problem with (multiple) snapshots, but brw_kiovec() isn't involved in the
deadlock and fsync_dev_lockfs() is.  So I would expect LVM2 (device-mapper)
to be susceptible to the same problem, at least in theory.

> > 2) I'm still seeing an occasional xfs_freeze deadlock.
> > xfs_unmountfs_writesb() (from xfs_freeze) and kupdated get stuck on separate
> > pagebuf locks.  It occurs with multiple snapshots and streaming writes to
> > the snapshot source over both samba and nfs.
> 
> Which kernel are you using ?  I've found that 2.4.18 can be easily
> persuaded to deadlock by having two processes making GFP_NOIO requests
> for memory whilst the system is short of free memory. 2.4.19-pre9
> works fine.
> 
2.4.18.  I've been able to induce memory deadlocks (processes in D state
descending from alloc_pages) on my 64K box with multiple snapshots, but
haven't been too worried about that since I expect it.  On a 1 GB system I
haven't seen the deadlocks, or at least recognized them as such.  The one I'm
seeing has a ton of writing processes waiting on check_frozen (which is
fine), kupdated stuck on pagebuf_lock(), and xfs_freeze waiting on
_pagebuf_wait_unpin().  Is this something you've seen?

I hope to have this tested on 2.4.19-pre10 Real Soon Now.

Dale Stephenson
steph@snapserver.com


* Re: [linux-lvm] How well tested is the snapshot feature?
  2002-06-07 13:26 Dale Stephenson
@ 2002-06-08  4:00 ` Joe Thornber
  2002-06-08  4:11   ` Joe Thornber
  0 siblings, 1 reply; 14+ messages in thread
From: Joe Thornber @ 2002-06-08  4:00 UTC
  To: linux-lvm; +Cc: 'linux-xfs@oss.sgi.com'

On Fri, Jun 07, 2002 at 11:31:30AM -0700, Dale Stephenson wrote:
> Subjective impression.  kupdated always seems to be in D state with
> streaming writes and snapshots, more so than a similar stream directed at
> LVM + XFS without snapshots.  While brw_kiovec and kcopyd stay away from the
> filesystem, the filesystem doesn't stay away from them!  When kupdated
> writes out something to a LV with multiple snapshots, multiple COW can
> occur.

The big weakness of snapshots in LVM1 and EVMS is that they perform
the copy-on-write exception synchronously, i.e. if a process schedules
a lot of writes to a device (e.g. kupdate), and these writes trigger a
lot of exceptions, the exceptions will be performed one after the
other.  So if you are using an 8k chunk size for each exception (small
chunk sizes eliminate redundant copying), and kupdate triggers 1M of
exceptions, LVM1 and EVMS will perform the following steps:

1) Issue read of original chunk
2) wait
3) issue write
4) wait

And it will do it for *every* chunk, 128 times in this case.  So
that's 256 times in total that the original process spends waiting for
the disk.  No wonder you see kupdate in the 'D' state.

In order to combat this effect you will be forced to use larger chunk
sizes in the hope that most of these exceptions are to adjacent parts
of the disk.
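
The arithmetic above can be checked with a small sketch.  It assumes "1M"
and "8k" mean 1 MiB and 8 KiB, and that the writer blocks exactly twice per
chunk (once for the read, once for the write), as in the four steps listed.
cow_chunks() and sync_cow_waits() are hypothetical helper names, not LVM
functions.

```c
#include <assert.h>

/* Number of COW chunks needed to cover `bytes` of exceptions. */
unsigned long cow_chunks(unsigned long bytes, unsigned long chunk_size)
{
    assert(chunk_size > 0);
    return (bytes + chunk_size - 1) / chunk_size;   /* round up */
}

/*
 * Blocking waits seen by the writing process when every chunk is
 * copied synchronously: one wait for the read of the original chunk
 * and one wait for the write to the COW store.
 */
unsigned long sync_cow_waits(unsigned long bytes, unsigned long chunk_size)
{
    return 2 * cow_chunks(bytes, chunk_size);
}
```

With these definitions, cow_chunks(1024 * 1024, 8 * 1024) gives 128 and
sync_cow_waits() for the same arguments gives 256.  Raising the chunk size
to 512 KiB drops the count to 2 chunks and 4 waits, which is why the
synchronous design pushes people towards large chunks.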

With device-mapper, if an exception is triggered it is immediately
handed to kcopyd, and device-mapper then carries on servicing
subsequent requests, typically queuing more and more exceptions with
kcopyd.  Kcopyd tries to perform as many of these copies as possible
at once, which gives us two major benefits.

i) The read for one exception can occur at the same time as the write
   for another.  Assuming the COW store and the origin are on separate
   PVs, on average this reduces the overhead of performing an exception
   by a half.

ii) There is no unnecessary waiting!  This waiting is readily
    apparent in the graph on

    http://people.sistina.com/~thornber/snap_performance.html

    Since this benchmark is based on dbench, which just creates and
    removes very large files, it is advantageous to LVM1/EVMS since
    there will be little redundant copying when they use large chunk
    sizes.  It would be interesting to use a benchmark that touches lots
    of little files scattered over a huge filesystem - that would at
    least highlight the inefficiency of copying 512k when a 1k file is
    touched.

So with LVM2 people are encouraged to use small chunk sizes to avoid
redundant copying.
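
To put a number on the redundant copying, here is a rough sketch of how much
data a first write drags into the COW store for a given chunk size.  It
assumes whole chunks are always copied and that the write starts at byte
offset `off` on the origin; cow_bytes_copied() is a hypothetical helper for
illustration only.

```c
#include <assert.h>

/*
 * Bytes copied to the COW store when `touched` bytes starting at
 * offset `off` are written for the first time, given that whole
 * chunks of `chunk_size` bytes are copied.
 */
unsigned long cow_bytes_copied(unsigned long off, unsigned long touched,
                               unsigned long chunk_size)
{
    unsigned long first, last;

    assert(chunk_size > 0 && touched > 0);
    first = off / chunk_size;                   /* first chunk hit */
    last = (off + touched - 1) / chunk_size;    /* last chunk hit  */
    return (last - first + 1) * chunk_size;
}
```

For a 1k write at offset 0 this gives 512k copied with 512k chunks but only
8k copied with 8k chunks, a factor of 64 less.  A write that straddles a
chunk boundary costs two chunks in either case.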

> The problem I'm seeing now is with
> xfs_unmountfs_writesb() as called from xfs_fs_freeze().  I've only seen the
> problem with (multiple) snapshots, but brw_kiovec() isn't involved in the
> deadlock and fsync_dev_lockfs() is.  So I would expect LVM2 (device-mapper)
> to be susceptible to the same problem, at least in theory.

Yes, it sounds like a bug in xfs.

> 2.4.18.  I've been able to induce memory deadlocks (processes in D state
> descending from alloc_pages) on my 64K box with multiple snapshots, but
> haven't been too worried about that since I expect it.  On a 1 GB system I
> haven't seen the deadlocks, or at least recognized them as such.  The one I'm
> seeing has a ton of writing processes waiting on check_frozen (which is
> fine), kupdated stuck on pagebuf_lock(), and xfs_freeze waiting on
> _pagebuf_wait_unpin().  Is this something you've seen?

No, the deadlocks I've seen seemed to involve a thread staying
permanently in the rebalance loop in __alloc_pages.

- Joe


* Re: [linux-lvm] How well tested is the snapshot feature?
  2002-06-08  4:00 ` Joe Thornber
@ 2002-06-08  4:11   ` Joe Thornber
  0 siblings, 0 replies; 14+ messages in thread
From: Joe Thornber @ 2002-06-08  4:11 UTC
  To: linux-lvm; +Cc: 'linux-xfs@oss.sgi.com'

On Sat, Jun 08, 2002 at 09:59:31AM +0100, Joe Thornber wrote:
> exceptions, LVM1 and EVMS will perform the following steps:
> 
> 1) Issue read of original chunk
> 2) wait
> 3) issue write
> 4) wait

I forgot about the metadata update:

5) issue metadata write
6) wait

device-mapper batches the metadata updates; under load this amortises
the cost away (well, to a point where I can't measure it).
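
As a rough illustration of the amortisation, the sketch below compares the
per-chunk metadata wait of the synchronous scheme with one metadata write
per batch of completed exceptions.  The batch size is a made-up parameter:
nothing in this thread says how device-mapper actually groups its updates.
sync_waits_with_metadata() and batched_metadata_writes() are hypothetical
names.

```c
#include <assert.h>

/*
 * Synchronous scheme: each chunk costs three waits
 * (read, write to the COW store, metadata write).
 */
unsigned long sync_waits_with_metadata(unsigned long chunks)
{
    return 3 * chunks;
}

/*
 * Batched scheme: one metadata write covers up to `batch` completed
 * exceptions.  `batch` is an assumed tuning knob for this sketch.
 */
unsigned long batched_metadata_writes(unsigned long chunks,
                                      unsigned long batch)
{
    assert(batch > 0);
    return (chunks + batch - 1) / batch;    /* round up */
}
```

For the 128-chunk example from the previous mail, the synchronous scheme
comes to 384 waits, while an assumed batch of 32 would need only 4 metadata
writes for the same 128 exceptions.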

- Joe
