linux-lvm.redhat.com archive mirror
* [linux-lvm] [OOPS] full snapshot (with test vfs locking patch for reiserfs snapshots in 11-pre)
       [not found]   ` <77260000.1002507094@tiny>
@ 2001-10-08 15:53     ` Ed Tomlinson
  2001-10-08 16:54       ` [linux-lvm] " Chris Mason
  0 siblings, 1 reply; 12+ messages in thread
From: Ed Tomlinson @ 2001-10-08 15:53 UTC (permalink / raw)
  To: Chris Mason; +Cc: linux-lvm

Hi Chris,

To summarize: with 2.4.11-pre5 and lvm-1.01-rc4 + your test vfs locking patch 
for 2.4.11-pre I was, unlike you, able to make snapshots and reiserfsck them without 
problems.  Then I tried a dbench 50 run and got an oops.

ksymoops 2.4.3 on i586 2.4.10-e1.  Options used
     -V (default)
     -k 20011008092859.ksyms (specified)
     -l 20011008092859.modules (specified)
     -o /lib/modules/2.4.11-pre5 (specified)
     -m /boot/System.map-2.4.11-pre5 (specified)

lvm -- giving up to snapshot /dev/lv/root on /dev/lv/snap: out of space
Unable to handle kernel NULL pointer dereference at virtual address 00000d68
d680a45c
*pde = 00000000
Oops: 0000
CPU:    0
EIP:    0010:[<d680a45c>]    Tainted: P 
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010246
eax: 00001fff   ebx: 00000d68   ecx: d36dac00   edx: 00000000
esi: 00105670   edi: 00002101   ebp: 00000000   esp: d220bdf8
ds: 0018   es: 0018   ss: 0018
Process syslogd (pid: 219, stackpage=d220b000)
Stack: d36dac00 d36dad70 d36da770 00e00000 00e00000 000001ad 00000058 00000000 
       d68075c0 d220be56 d220be58 00104470 d36dac00 00003a01 c1f6dd40 00005258 
       00e00000 c029ef28 c029eec0 d36da600 d36d1000 00104470 00000020 2101b7e0 
Call Trace: [<d68075c0>] [<d6807695>] [<c0185b4c>] [<c0185bb1>] [<c0185d07>] 
   [<c0164e69>] [<c01683ee>] [<c0167647>] [<c0155eac>] [<c012d8ce>] [<c0106d53>] 
Code: 8b 0b eb 03 45 8b 09 39 d9 74 27 39 71 08 75 f4 66 39 79 0c 

>>EIP; d680a45c <[lvm-mod]lvm_snapshot_remap_block+70/d4>   <=====
Trace; d68075c0 <[lvm-mod]lvm_map+3b0/478>
Trace; d6807694 <[lvm-mod]lvm_make_request_fn+c/1c>
Trace; c0185b4c <generic_make_request+130/140>
Trace; c0185bb0 <submit_bh+54/70>
Trace; c0185d06 <ll_rw_block+13a/1a0>
Trace; c0164e68 <flush_commit_list+208/388>
Trace; c01683ee <do_journal_end+746/9ec>
Trace; c0167646 <journal_end_sync+12/18>
Trace; c0155eac <reiserfs_sync_file+80/9c>
Trace; c012d8ce <sys_fsync+5e/8c>
Trace; c0106d52 <system_call+32/40>
Code;  d680a45c <[lvm-mod]lvm_snapshot_remap_block+70/d4>
00000000 <_EIP>:
Code;  d680a45c <[lvm-mod]lvm_snapshot_remap_block+70/d4>   <=====
   0:   8b 0b                     mov    (%ebx),%ecx   <=====
Code;  d680a45e <[lvm-mod]lvm_snapshot_remap_block+72/d4>
   2:   eb 03                     jmp    7 <_EIP+0x7> d680a462 <[lvm-mod]lvm_snapshot_remap_block+76/d4>
Code;  d680a460 <[lvm-mod]lvm_snapshot_remap_block+74/d4>
   4:   45                        inc    %ebp
Code;  d680a460 <[lvm-mod]lvm_snapshot_remap_block+74/d4>
   5:   8b 09                     mov    (%ecx),%ecx
Code;  d680a462 <[lvm-mod]lvm_snapshot_remap_block+76/d4>
   7:   39 d9                     cmp    %ebx,%ecx
Code;  d680a464 <[lvm-mod]lvm_snapshot_remap_block+78/d4>
   9:   74 27                     je     32 <_EIP+0x32> d680a48e <[lvm-mod]lvm_snapshot_remap_block+a2/d4>
Code;  d680a466 <[lvm-mod]lvm_snapshot_remap_block+7a/d4>
   b:   39 71 08                  cmp    %esi,0x8(%ecx)
Code;  d680a46a <[lvm-mod]lvm_snapshot_remap_block+7e/d4>
   e:   75 f4                     jne    4 <_EIP+0x4> d680a460 <[lvm-mod]lvm_snapshot_remap_block+74/d4>
Code;  d680a46c <[lvm-mod]lvm_snapshot_remap_block+80/d4>
  10:   66 39 79 0c               cmp    %di,0xc(%ecx)

I suspect there is a problem with snapshots filling up in 2.4.11-pre with lvm 1.01-rc4.

Ed Tomlinson
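
One possible reading of the register dump above, offered as an assumption rather than a confirmed diagnosis: the faulting instruction is `mov (%ebx),%ecx` with ebx = 00000d68, which matches the faulting address, and the stack dump contains the word 000001ad. If the hash table base pointer were NULL and each bucket were an 8-byte list head (two 32-bit pointers on i386), bucket 0x1ad would sit at exactly 0xd68. The struct and function names below are hypothetical and only model that arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of an i386 list head: two 32-bit pointers, 8 bytes. */
struct list_head32 {
    uint32_t next;
    uint32_t prev;
};

/* Address of &table[index] when the table starts at `base`. */
static uint32_t bucket_address(uint32_t base, uint32_t index)
{
    /* The model only holds if a bucket really is 8 bytes wide. */
    assert(sizeof(struct list_head32) == 8);
    return base + index * (uint32_t)sizeof(struct list_head32);
}
```

With a NULL base, `bucket_address(0, 0x1ad)` evaluates to 0xd68, the address reported in the oops.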


* [linux-lvm] Re: [OOPS] full snapshot (with test vfs locking patch for reiserfs snapshots in 11-pre)
  2001-10-08 15:53     ` [linux-lvm] [OOPS] full snapshot (with test vfs locking patch for reiserfs snapshots in 11-pre) Ed Tomlinson
@ 2001-10-08 16:54       ` Chris Mason
  2001-10-08 17:05         ` Ed Tomlinson
  0 siblings, 1 reply; 12+ messages in thread
From: Chris Mason @ 2001-10-08 16:54 UTC (permalink / raw)
  To: Ed Tomlinson; +Cc: linux-lvm


On Monday, October 08, 2001 11:53:46 AM -0400 Ed Tomlinson
<tomlins@CAM.ORG> wrote:

> Hi Chris,
> 
> To summarize, with 2.4.11-pre5 and lvm-1.01-rc4 + your test vfslocking
> patch  for 2.4.11-pre I, unlike you, was able to make snapshots and
> reiserfsck them without  problems.  Then I tried a dbench 50 run and got
> an oops.

Confirmed, similar oops here.  Do you get this with rc4 in 2.4.10?

-chris


* [linux-lvm] Re: [OOPS] full snapshot (with test vfs locking patch for reiserfs snapshots in 11-pre)
  2001-10-08 16:54       ` [linux-lvm] " Chris Mason
@ 2001-10-08 17:05         ` Ed Tomlinson
  2001-10-08 17:40           ` Chris Mason
  2001-10-08 19:51           ` Chris Mason
  0 siblings, 2 replies; 12+ messages in thread
From: Ed Tomlinson @ 2001-10-08 17:05 UTC (permalink / raw)
  To: Chris Mason; +Cc: linux-lvm

On October 8, 2001 12:54 pm, Chris Mason wrote:
> On Monday, October 08, 2001 11:53:46 AM -0400 Ed Tomlinson
>
> <tomlins@CAM.ORG> wrote:
> > Hi Chris,
> >
> > To summarize, with 2.4.11-pre5 and lvm-1.01-rc4 + your test vfslocking
> > patch  for 2.4.11-pre I, unlike you, was able to make snapshots and
> > reiserfsck them without  problems.  Then I tried a dbench 50 run and got
> > an oops.
>
> Confirmed, similar oops here.  Do you get this with rc4 in 2.4.10?

Funny you should ask...  Guess what was running when this email arrived?  The 
answer is that with 2.4.10 + lvm 1.01-rc4 the snapshot deactivates correctly - 
I do not see an oops.

Ed Tomlinson


* [linux-lvm] Re: [OOPS] full snapshot (with test vfs locking patch for reiserfs snapshots in 11-pre)
  2001-10-08 17:05         ` Ed Tomlinson
@ 2001-10-08 17:40           ` Chris Mason
  2001-10-08 19:51           ` Chris Mason
  1 sibling, 0 replies; 12+ messages in thread
From: Chris Mason @ 2001-10-08 17:40 UTC (permalink / raw)
  To: Ed Tomlinson; +Cc: linux-lvm


On Monday, October 08, 2001 01:05:48 PM -0400 Ed Tomlinson <tomlins@CAM.ORG> wrote:

> On October 8, 2001 12:54 pm, Chris Mason wrote:
>> On Monday, October 08, 2001 11:53:46 AM -0400 Ed Tomlinson
>> 
>> <tomlins@CAM.ORG> wrote:
>> > Hi Chris,
>> > 
>> > To summarize, with 2.4.11-pre5 and lvm-1.01-rc4 + your test vfslocking
>> > patch  for 2.4.11-pre I, unlike you, was able to make snapshots and
>> > reiserfsck them without  problems.  Then I tried a dbench 50 run and
>> > got an oops.
>> 
>> Confirmed, similar oops here.  Do you get this with rc4 in 2.4.10?
> 
> Funny you should ask...  Guess what was running when this email arrived?
> The  answer is that with 2.4.10 + lvm 1.01-rc4 the snapshot deactivates
> correctly -  I do not see an oops.
>

Hmmm, looks like pure luck then.  It does not look like all the callers of lvm_snapshot_remap_block are properly checking to make sure the snapshot is still valid (hasn't been run through lvm_snapshot_release).

Try this:

-chris

Index: 0.21/drivers/md/lvm-snap.c
--- 0.21/drivers/md/lvm-snap.c Sat, 06 Oct 2001 00:07:22 -0400 
+++ 0.21(w)/drivers/md/lvm-snap.c Mon, 08 Oct 2001 13:35:41 -0400 
@@ -108,6 +108,9 @@
 	lv_block_exception_t * ret;
 	int i = 0;
 
+	if (!hash_table || !lv->lv_block_exception)
+		return NULL ;
+
 	hash_table = &hash_table[hashfn(org_dev, org_start, mask, chunk_size)];
 	ret = NULL;
 	for (next = hash_table->next; next != hash_table; next = next->next)
@@ -140,6 +143,8 @@
 	unsigned long mask = lv->lv_snapshot_hash_mask;
 	int chunk_size = lv->lv_chunk_size;
 
+	if (!hash_table)
+		BUG() ;
 	hash_table = &hash_table[hashfn(org_dev, org_start, mask, chunk_size)];
 	list_add(&exception->hash, hash_table);
 }
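
The guard added by the patch above can be illustrated with a small user-space model. This is only a sketch under stated assumptions: the types, the trivial `sector & mask` hash, and every function name here are hypothetical stand-ins, not the lvm-snap.c code. It shows a lookup that walks a circular bucket list and returns NULL, instead of dereferencing a NULL table, once the snapshot has been released.

```c
#include <assert.h>
#include <stddef.h>

struct list_node { struct list_node *next, *prev; };

struct exception {
    struct list_node hash;        /* first member, so a node pointer is also an exception pointer */
    unsigned long rsector_org;
    unsigned short rdev_org;
};

struct snapshot {
    struct list_node *hash_table; /* NULL once the snapshot is released */
    struct exception *exceptions; /* NULL once the snapshot is released */
    unsigned long hash_mask;
};

/* Make every bucket an empty circular list. */
static void init_buckets(struct list_node *table, unsigned long n)
{
    unsigned long i;
    for (i = 0; i < n; i++)
        table[i].next = table[i].prev = &table[i];
}

/* Insert at the head of the bucket; adding to a released snapshot is a bug. */
static void add_exception(struct snapshot *s, struct exception *e)
{
    struct list_node *bucket;
    assert(s->hash_table != NULL);
    bucket = &s->hash_table[e->rsector_org & s->hash_mask];
    e->hash.next = bucket->next;
    e->hash.prev = bucket;
    bucket->next->prev = &e->hash;
    bucket->next = &e->hash;
}

/* Lookup that tolerates a released snapshot. */
static struct exception *find_exception(struct snapshot *s,
                                        unsigned long sector,
                                        unsigned short dev)
{
    struct list_node *bucket, *n;

    if (!s->hash_table || !s->exceptions)
        return NULL;              /* released: nothing to remap */

    bucket = &s->hash_table[sector & s->hash_mask];
    for (n = bucket->next; n != bucket; n = n->next) {
        struct exception *e = (struct exception *)n;
        if (e->rsector_org == sector && e->rdev_org == dev)
            return e;
    }
    return NULL;
}
```

The asymmetry mirrors the patch: a lookup on a released snapshot quietly reports "not found", while an insert on a released snapshot is treated as a programming error.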
 


* [linux-lvm] Re: [OOPS] full snapshot (with test vfs locking patch for reiserfs snapshots in 11-pre)
  2001-10-08 17:05         ` Ed Tomlinson
  2001-10-08 17:40           ` Chris Mason
@ 2001-10-08 19:51           ` Chris Mason
  2001-10-09  1:57             ` Ed Tomlinson
  1 sibling, 1 reply; 12+ messages in thread
From: Chris Mason @ 2001-10-08 19:51 UTC (permalink / raw)
  To: Ed Tomlinson; +Cc: linux-lvm


On Monday, October 08, 2001 01:05:48 PM -0400 Ed Tomlinson <tomlins@CAM.ORG> wrote:

> On October 8, 2001 12:54 pm, Chris Mason wrote:
>> On Monday, October 08, 2001 11:53:46 AM -0400 Ed Tomlinson
>> 
>> <tomlins@CAM.ORG> wrote:
>> > Hi Chris,
>> > 
>> > To summarize, with 2.4.11-pre5 and lvm-1.01-rc4 + your test vfslocking
>> > patch  for 2.4.11-pre I, unlike you, was able to make snapshots and
>> > reiserfsck them without  problems.  Then I tried a dbench 50 run and got
>> > an oops.
>> 
>> Confirmed, similar oops here.  Do you get this with rc4 in 2.4.10?
> 
> Funny you should ask...  Guess what was running when this email arrived?  The 
> answer is that with 2.4.10 + lvm 1.01-rc4 the snapshot deactivates correctly - 
> I do not see an oops.

Ok, that first patch won't quite fix it, as we can still oops
in lvm_snapshot_COW.  This one works better for me:

--- 0.21/drivers/md/lvm.c Sun, 07 Oct 2001 22:15:54 -0400 
+++ 0.21(w)/drivers/md/lvm.c Mon, 08 Oct 2001 15:54:42 -0400 
@@ -1142,7 +1142,8 @@
 
 	/* we must redo lvm_snapshot_remap_block in order to avoid a
 	   race condition in the gap where no lock was held */
-	if (!lvm_snapshot_remap_block(&rdev, &rsector, pe_start, lv) &&
+	if (lv->lv_block_exception && 
+	    !lvm_snapshot_remap_block(&rdev, &rsector, pe_start, lv) &&
 	    !lvm_snapshot_COW(rdev, rsector, pe_start, rsector, vg, lv))
 		lvm_write_COW_table_block(vg, lv);
 
@@ -1151,11 +1152,12 @@
 
 static inline void _remap_snapshot(kdev_t rdev, ulong rsector,
 				   ulong pe_start, lv_t *lv, vg_t *vg) {
-	int r;
+	int r = 0;
 
 	/* check to see if this chunk is already in the snapshot */
 	down_read(&lv->lv_lock);
-	r = lvm_snapshot_remap_block(&rdev, &rsector, pe_start, lv);
+	if (lv->lv_block_exception)
+		r = lvm_snapshot_remap_block(&rdev, &rsector, pe_start, lv);
 	up_read(&lv->lv_lock);
 
 	if (!r)
Index: 0.21/drivers/md/lvm-snap.c
--- 0.21/drivers/md/lvm-snap.c Sat, 06 Oct 2001 00:07:22 -0400 root (linux/i/c/38_lvm-snap.c 1.1.2.1.2.1 644)
+++ 0.21(w)/drivers/md/lvm-snap.c Mon, 08 Oct 2001 15:13:10 -0400 root (linux/i/c/38_lvm-snap.c 1.1.2.1.2.1 644)
@@ -140,6 +140,8 @@
 	unsigned long mask = lv->lv_snapshot_hash_mask;
 	int chunk_size = lv->lv_chunk_size;
 
+	if (!hash_table)
+		BUG() ;
 	hash_table = &hash_table[hashfn(org_dev, org_start, mask, chunk_size)];
 	list_add(&exception->hash, hash_table);
 }
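
The ordering used in the lvm.c hunks above, where the validity check comes first in the `&&` chain, can be modelled in user space. This is a sketch under assumptions: the struct, the counters, and the two stand-in functions are hypothetical, the stand-ins always return 0, and the locking (`down_read`/`up_read` in the patch) is left out entirely. It only demonstrates that short-circuit evaluation keeps both the remap and the copy-on-write step from touching a released snapshot.

```c
#include <assert.h>
#include <stddef.h>

struct snap {
    void *exceptions;   /* NULL after the snapshot has been released */
    int remap_calls;    /* how often the remap stand-in ran */
    int cow_calls;      /* how often the copy-on-write stand-in ran */
};

/* Stand-in for the remap step: 0 means "chunk not yet in the snapshot". */
static int remap_block(struct snap *s)
{
    s->remap_calls++;
    return 0;
}

/* Stand-in for the copy-on-write step: 0 means "copied successfully". */
static int copy_on_write(struct snap *s)
{
    s->cow_calls++;
    return 0;
}

/* Returns 1 when the caller should go on to write the COW table block. */
static int remap_checked(struct snap *s)
{
    assert(s != NULL);
    /* The validity test is evaluated first, so neither stand-in runs
       when the exception table is gone. */
    if (s->exceptions && !remap_block(s) && !copy_on_write(s))
        return 1;
    return 0;
}
```

For a live snapshot both stand-ins run once and the function returns 1; for a released one it returns 0 and both counters stay at zero.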


* [linux-lvm] Re: [OOPS] full snapshot (with test vfs locking patch for reiserfs snapshots in 11-pre)
  2001-10-08 19:51           ` Chris Mason
@ 2001-10-09  1:57             ` Ed Tomlinson
  2001-10-09  2:29               ` Chris Mason
  0 siblings, 1 reply; 12+ messages in thread
From: Ed Tomlinson @ 2001-10-09  1:57 UTC (permalink / raw)
  To: Chris Mason; +Cc: linux-lvm

Hi Chris

> Ok, that first patch won't quite fix it, as we can still oops
> in lvm_snapshot_COW.  This one works better for me:

Looks like this one has problems too.  Here is the oops I get with it:

ksymoops 2.4.3 on i586 2.4.10-e1.  Options used
     -V (default)
     -k 20011008181015.ksyms (specified)
     -l 20011008181015.modules (specified)
     -o /lib/modules/2.4.11-pre5 (specified)
     -m /boot/System.map-2.4.11-pre5 (specified)

Unable to handle kernel NULL pointer dereference at virtual address 00000008
d680b179
*pde = 00000000
Oops: 0002
CPU:    0
EIP:    0010:[<d680b179>]    Tainted: P 
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010286
eax: 00000000   ebx: d36e4000   ecx: d36e4000   edx: d36e0600
esi: d36e0600   edi: d36e0600   ebp: c4aa3d3c   esp: c4aa2cb8
ds: 0018   es: 0018   ss: 0018
Process dbench (pid: 864, stackpage=c4aa3000)
Stack: c15713a0 d680a50a d36e4000 d36e0600 00003a04 00000000 00000627 0170a470 
       d36e0600 c4aa3d3c d680aafd d36e4000 d36e0600 d680d3cb d36e0600 0170a470 
       d36e4000 00000000 00000002 000003ff 00000be4 00010000 c4aa2d38 00000400 
Call Trace: [<d680a50a>] [<d680aafd>] [<d680d3cb>] [<c01503d1>] [<c015d9e1>] 
   [<c014fcec>] [<c0158cbf>] [<c0159543>] [<c015036e>] [<c01503d1>] [<c015e32f>] 
   [<c015d6bf>] [<c01578ae>] [<c0157c18>] [<c01503d1>] [<c015dc6b>] [<c014ff11>] 
   [<c0158cbf>] [<c0159543>] [<c015036e>] [<c018a8ed>] [<c018a943>] [<c01629a3>] 
   [<c012e1dc>] [<c015fef7>] [<c0160700>] [<c010ffba>] [<d68071e5>] [<d68075f3>] 
   [<d6807695>] [<c0185b4c>] [<c0185bb1>] [<c012d4ab>] [<c012d552>] [<c012e25b>] 
   [<c012ebaf>] [<c012f0f7>] [<c0155975>] [<c0152f34>] [<c0122e42>] [<c012c86a>] 
   [<c0106d53>] 
Code: c7 40 08 01 00 00 00 89 e0 50 6a 00 52 51 e8 78 fc ff ff 83 

>>EIP; d680b178 <[lvm-mod]_disable_snapshot+10/44>   <=====
Trace; d680a50a <[lvm-mod]lvm_drop_snapshot+22/94>
Trace; d680aafc <[lvm-mod]lvm_snapshot_COW+3b4/3f4>
Trace; d680d3ca <[lvm-mod]lvm_name+8c2/e76>
Trace; c01503d0 <do_balance_mark_leaf_dirty+54/64>
Trace; c015d9e0 <leaf_insert_into_buf+23c/248>
Trace; c014fcec <balance_leaf+2210/24e8>
Trace; c0158cbe <reiserfs_kfree+12/38>
Trace; c0159542 <unfix_nodes+146/154>
Trace; c015036e <do_balance+ea/f8>
Trace; c01503d0 <do_balance_mark_leaf_dirty+54/64>
Trace; c015e32e <leaf_delete_items_entirely+1b2/1c0>
Trace; c015d6be <leaf_delete_items+5a/140>
Trace; c01578ae <get_parents+1aa/1c0>
Trace; c0157c18 <ip_check_balance+354/aac>
Trace; c01503d0 <do_balance_mark_leaf_dirty+54/64>
Trace; c015dc6a <leaf_paste_in_buffer+27e/28c>
Trace; c014ff10 <balance_leaf+2434/24e8>
Trace; c0158cbe <reiserfs_kfree+12/38>
Trace; c0159542 <unfix_nodes+146/154>
Trace; c015036e <do_balance+ea/f8>
Trace; c018a8ec <start_request+130/1f4>
Trace; c018a942 <start_request+186/1f4>
Trace; c01629a2 <reiserfs_paste_into_item+86/e0>
Trace; c012e1dc <getblk+18/40>
Trace; c015fef6 <is_tree_node+36/54>
Trace; c0160700 <search_by_key+7ec/c44>
Trace; c010ffba <schedule+256/384>
Trace; d68071e4 <[lvm-mod]__remap_snapshot+5c/88>
Trace; d68075f2 <[lvm-mod]lvm_map+3e2/478>
Trace; d6807694 <[lvm-mod]lvm_make_request_fn+c/1c>
Trace; c0185b4c <generic_make_request+130/140>
Trace; c0185bb0 <submit_bh+54/70>
Trace; c012d4aa <write_locked_buffers+1e/28>
Trace; c012d552 <write_some_buffers+9e/110>
Trace; c012e25a <balance_dirty+12/30>
Trace; c012ebae <__block_commit_write+a2/c0>
Trace; c012f0f6 <generic_commit_write+32/5c>
Trace; c0155974 <reiserfs_commit_write+30/a8>
Trace; c0152f34 <reiserfs_get_block+0/ca0>
Trace; c0122e42 <generic_file_write+4a6/5ac>
Trace; c012c86a <sys_write+8e/c4>
Trace; c0106d52 <system_call+32/40>
Code;  d680b178 <[lvm-mod]_disable_snapshot+10/44>
00000000 <_EIP>:
Code;  d680b178 <[lvm-mod]_disable_snapshot+10/44>   <=====
   0:   c7 40 08 01 00 00 00      movl   $0x1,0x8(%eax)   <=====
Code;  d680b17e <[lvm-mod]_disable_snapshot+16/44>
   7:   89 e0                     mov    %esp,%eax
Code;  d680b180 <[lvm-mod]_disable_snapshot+18/44>
   9:   50                        push   %eax
Code;  d680b182 <[lvm-mod]_disable_snapshot+1a/44>
   a:   6a 00                     push   $0x0
Code;  d680b184 <[lvm-mod]_disable_snapshot+1c/44>
   c:   52                        push   %edx
Code;  d680b184 <[lvm-mod]_disable_snapshot+1c/44>
   d:   51                        push   %ecx
Code;  d680b186 <[lvm-mod]_disable_snapshot+1e/44>
   e:   e8 78 fc ff ff            call   fffffc8b <_EIP+0xfffffc8b> d680ae02 <[lvm-mod]lvm_snapshot_release+b6/b8>
Code;  d680b18a <[lvm-mod]_disable_snapshot+22/44>
  13:   83 00 00                  addl   $0x0,(%eax)

Needed to reiserfsck with --fix-fixable after this one...

Hope this helps.

Ed


> --- 0.21/drivers/md/lvm.c Sun, 07 Oct 2001 22:15:54 -0400
> +++ 0.21(w)/drivers/md/lvm.c Mon, 08 Oct 2001 15:54:42 -0400
> @@ -1142,7 +1142,8 @@
>
>  	/* we must redo lvm_snapshot_remap_block in order to avoid a
>  	   race condition in the gap where no lock was held */
> -	if (!lvm_snapshot_remap_block(&rdev, &rsector, pe_start, lv) &&
> +	if (lv->lv_block_exception &&
> +	    !lvm_snapshot_remap_block(&rdev, &rsector, pe_start, lv) &&
>  	    !lvm_snapshot_COW(rdev, rsector, pe_start, rsector, vg, lv))
>  		lvm_write_COW_table_block(vg, lv);
>
> @@ -1151,11 +1152,12 @@
>
>  static inline void _remap_snapshot(kdev_t rdev, ulong rsector,
>  				   ulong pe_start, lv_t *lv, vg_t *vg) {
> -	int r;
> +	int r = 0;
>
>  	/* check to see if this chunk is already in the snapshot */
>  	down_read(&lv->lv_lock);
> -	r = lvm_snapshot_remap_block(&rdev, &rsector, pe_start, lv);
> +	if (lv->lv_block_exception)
> +		r = lvm_snapshot_remap_block(&rdev, &rsector, pe_start, lv);
>  	up_read(&lv->lv_lock);
>
>  	if (!r)
> Index: 0.21/drivers/md/lvm-snap.c
> --- 0.21/drivers/md/lvm-snap.c Sat, 06 Oct 2001 00:07:22 -0400 root (linux/i/c/38_lvm-snap.c 1.1.2.1.2.1 644)
> +++ 0.21(w)/drivers/md/lvm-snap.c Mon, 08 Oct 2001 15:13:10 -0400 root (linux/i/c/38_lvm-snap.c 1.1.2.1.2.1 644)
> @@ -140,6 +140,8 @@
>  	unsigned long mask = lv->lv_snapshot_hash_mask;
>  	int chunk_size = lv->lv_chunk_size;
>
> +	if (!hash_table)
> +		BUG() ;
>  	hash_table = &hash_table[hashfn(org_dev, org_start, mask, chunk_size)];
>  	list_add(&exception->hash, hash_table);
>  }


* [linux-lvm] Re: [OOPS] full snapshot (with test vfs locking patch for reiserfs snapshots in 11-pre)
  2001-10-09  1:57             ` Ed Tomlinson
@ 2001-10-09  2:29               ` Chris Mason
  2001-10-09 11:42                 ` Ed Tomlinson
  2001-10-10 21:28                 ` Ed Tomlinson
  0 siblings, 2 replies; 12+ messages in thread
From: Chris Mason @ 2001-10-09  2:29 UTC (permalink / raw)
  To: Ed Tomlinson; +Cc: linux-lvm


On Monday, October 08, 2001 09:57:59 PM -0400 Ed Tomlinson <tomlins@CAM.ORG> wrote:

> Hi Chris
> 
>> Ok, that first patch won't quite fix it, as we can still oops
>> in lvm_snapshot_COW.  This one works better for me:
> 
> Looks like this one has problems too.  Here is the oops I get with it:
>>> EIP; d680b178 <[lvm-mod]_disable_snapshot+10/44>   <=====
> Trace; d680a50a <[lvm-mod]lvm_drop_snapshot+22/94>
> Trace; d680aafc <[lvm-mod]lvm_snapshot_COW+3b4/3f4>
> Trace; d680d3ca <[lvm-mod]lvm_name+8c2/e76>

Hmmm, this should be the same as the bug fixed by the second patch.  We know the snapshot hasn't been released yet when lvm_snapshot_COW is called, so _disable_snapshot should not oops.

We've got a write lock on the lv semaphore, so nobody else should be calling lvm_drop_snapshot on us.  I hate to ask, but are you sure you did an rmmod before the modules_install?

Regardless, we can fix the oops in _disable_snapshot, I just don't see how the locking allows it to happen.  This incremental fix should do it:

--- 0.21/drivers/md/lvm-snap.c Sat, 06 Oct 2001 00:07:22 -0400 
+++ 0.21(w)/drivers/md/lvm-snap.c Mon, 08 Oct 2001 22:39:54 -0400 
@@ -687,6 +694,10 @@
 
 static void _disable_snapshot(vg_t *vg, lv_t *lv) {
 	const char *err;
+	if (!lv->lv_block_exception) {
+		printk(KERN_ERR "%s -- snapshot already disabled\n", lvm_name);
+		return ;
+	}
 	lv->lv_block_exception[0].rsector_org = LVM_SNAPSHOT_DROPPED_SECTOR;
 	if(_write_COW_table_block(vg, lv, 0, &err) < 0) {
 		printk(KERN_ERR "%s -- couldn't disable snapshot: %s\n",
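
As a rough user-space illustration of the incremental fix above (a sketch, not the driver code): the second oops faults at address 00000008 on `movl $0x1,0x8(%eax)` with eax = 0, which is consistent with writing a field at offset 8 of the first entry of a NULL exception table. The 32-bit layout, the `DROPPED_SECTOR` value of 1 (inferred from the disassembly), and all names below are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define DROPPED_SECTOR 1u   /* assumed from the "movl $0x1" in the oops */

/* Hypothetical i386-style layout: 8 bytes of list linkage, then the sector. */
struct exception32 {
    uint32_t next;
    uint32_t prev;
    uint32_t rsector_org;   /* lands at offset 8 in this model */
};

struct snap {
    struct exception32 *exceptions;   /* NULL once the snapshot is released */
};

/* Returns 0 if the snapshot was marked dropped, -1 if it was already gone. */
static int disable_snapshot(struct snap *s)
{
    assert(s != NULL);
    if (!s->exceptions) {
        fprintf(stderr, "snapshot already disabled\n");
        return -1;          /* calling again is harmless */
    }
    s->exceptions[0].rsector_org = DROPPED_SECTOR;
    return 0;
}
```

Guarding the write makes the disable path safe to call twice, which matters here because the locking analysis in the message above does not explain how the second call happens.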


* [linux-lvm] Re: [OOPS] full snapshot (with test vfs locking patch for reiserfs snapshots in 11-pre)
  2001-10-09  2:29               ` Chris Mason
@ 2001-10-09 11:42                 ` Ed Tomlinson
  2001-10-10 21:28                 ` Ed Tomlinson
  1 sibling, 0 replies; 12+ messages in thread
From: Ed Tomlinson @ 2001-10-09 11:42 UTC (permalink / raw)
  To: Chris Mason; +Cc: linux-lvm

On October 8, 2001 10:29 pm, Chris Mason wrote:

> We've got a write lock on the lv semaphore, so nobody else should be
> calling lvm_drop_snapshot on us.  I hate to ask, but are you sure you did
> an rmmod before the modules_install?

My root is on LVM, so the only way to reload the module is to reboot.  In this case
I rebuilt from scratch (clean, dep, bzImage, modules, modules_install, stuff, 
lvmcreate_initrd <kernel>, lilo, reboot).  After I got the oops I rebooted
and rechecked the source...

> Regardless, we can fix the oops in _disable_snapshot, I just don't see how
> the locking allows it to happen.  This incremental fix should do it:

I will try this tonight.  Was off yesterday.

Thanks Chris
Ed

> --- 0.21/drivers/md/lvm-snap.c Sat, 06 Oct 2001 00:07:22 -0400
> +++ 0.21(w)/drivers/md/lvm-snap.c Mon, 08 Oct 2001 22:39:54 -0400
> @@ -687,6 +694,10 @@
>
>  static void _disable_snapshot(vg_t *vg, lv_t *lv) {
>  	const char *err;
> +	if (!lv->lv_block_exception) {
> +		printk(KERN_ERR "%s -- snapshot already disabled\n", lvm_name);
> +		return ;
> +	}
>  	lv->lv_block_exception[0].rsector_org = LVM_SNAPSHOT_DROPPED_SECTOR;
>  	if(_write_COW_table_block(vg, lv, 0, &err) < 0) {
>  		printk(KERN_ERR "%s -- couldn't disable snapshot: %s\n",


* [linux-lvm] Re: [OOPS] full snapshot (with test vfs locking patch for reiserfs snapshots in 11-pre)
  2001-10-09  2:29               ` Chris Mason
  2001-10-09 11:42                 ` Ed Tomlinson
@ 2001-10-10 21:28                 ` Ed Tomlinson
  2001-10-10 23:21                   ` Chris Mason
  1 sibling, 1 reply; 12+ messages in thread
From: Ed Tomlinson @ 2001-10-10 21:28 UTC (permalink / raw)
  To: Chris Mason; +Cc: linux-lvm

On October 8, 2001 10:29 pm, Chris Mason wrote:
> On Monday, October 08, 2001 09:57:59 PM -0400 Ed Tomlinson <tomlins@CAM.ORG> wrote:
> > Hi Chris
> >
> >> Ok, that first patch won't quite fix it, as we can still oops
> >> in lvm_snapshot_COW.  This one works better for me:
> >
> > Looks like this one has problems too.  Here is the oops I get with it:
> >>> EIP; d680b178 <[lvm-mod]_disable_snapshot+10/44>   <=====
> >
> > Trace; d680a50a <[lvm-mod]lvm_drop_snapshot+22/94>
> > Trace; d680aafc <[lvm-mod]lvm_snapshot_COW+3b4/3f4>
> > Trace; d680d3ca <[lvm-mod]lvm_name+8c2/e76>
>
> Hmmm, this should be the same as the bug fixed by the second patch.  We
> know the snapshot hasn't been released yet when lvm_snapshot_COW is called,
> so _disable_snapshot should not oops.

Chris, with 2.4.11, your test vfs locking patch (manually fixing a reject) plus
the second snapshot full patch with the addon from this message I have not been
able to break things.  

If you are happy too, suggest you post the patches.

Thanks
Ed Tomlinson


* [linux-lvm] Re: [OOPS] full snapshot (with test vfs locking patch for reiserfs snapshots in 11-pre)
  2001-10-10 21:28                 ` Ed Tomlinson
@ 2001-10-10 23:21                   ` Chris Mason
  2001-10-11  0:37                     ` Ed Tomlinson
  0 siblings, 1 reply; 12+ messages in thread
From: Chris Mason @ 2001-10-10 23:21 UTC (permalink / raw)
  To: Ed Tomlinson; +Cc: linux-lvm


On Wednesday, October 10, 2001 05:28:21 PM -0400 Ed Tomlinson <tomlins@CAM.ORG> wrote:

> Chris, with 2.4.11, your test vfs locking patch (manually fixing a reject) plus
> the second snapshot full patch with the addon from this message I have not been
> able to break things.  
> 
> If you are happy too, suggest you post the patches.

Good to hear Ed, many thanks.  Would you mind running one extra
set of tests?  I'm having problems with the system crashing/deadlocking
under high snapshot COW load.  10 concurrent writers onto the source
volume triggers it within 10-20 minutes.  I'm using stress.sh, 
but dbench should work too...I'm seeing this on pure rc4 (2.4.1[01],
and with my fixes).

Once this is worked out, I'll send everything in.

-chris


* [linux-lvm] Re: [OOPS] full snapshot (with test vfs locking patch for reiserfs snapshots in 11-pre)
  2001-10-10 23:21                   ` Chris Mason
@ 2001-10-11  0:37                     ` Ed Tomlinson
  2001-10-11  1:27                       ` Chris Mason
  0 siblings, 1 reply; 12+ messages in thread
From: Ed Tomlinson @ 2001-10-11  0:37 UTC (permalink / raw)
  To: Chris Mason; +Cc: linux-lvm

On October 10, 2001 07:21 pm, Chris Mason wrote:
> On Wednesday, October 10, 2001 05:28:21 PM -0400 Ed Tomlinson <tomlins@CAM.ORG> wrote:
> > Chris, with 2.4.11, your test vfs locking patch (manually fixing a
> > reject) plus the second snapshot full patch with the addon from this
> > message I have not been able to break things.
> >
> > If you are happy too, suggest you post the patches.
>
> Good to hear Ed, many thanks.  Would you mind running one extra
> set of tests?  I'm having problems with the system crashing/deadlocking
> under high snapshot COW load.  10 concurrent writers onto the source
> volume triggers it within 10-20 minutes.  I'm using stress.sh,
> but dbench should work too...I'm seeing this on pure rc4 (2.4.1[01],
> and with my fixes).
>
> Once this is worked out, I'll send everything in.

I just finished running about 40 mins of dbench filling 3 1G snapshots.
No stalls/oopses.  I am creating the snapshot with

lvcreate -L1G -s -n snap /dev/lv/root /dev/hda3

/dev/lv/root lives on hde.  Does your snapshot live on the same physical
disk as its source disk?

Can you please send me the version of stress.sh you are using?  I'll see
if that triggers anything here.

Ed


* [linux-lvm] Re: [OOPS] full snapshot (with test vfs locking patch for reiserfs snapshots in 11-pre)
  2001-10-11  0:37                     ` Ed Tomlinson
@ 2001-10-11  1:27                       ` Chris Mason
  0 siblings, 0 replies; 12+ messages in thread
From: Chris Mason @ 2001-10-11  1:27 UTC (permalink / raw)
  To: Ed Tomlinson; +Cc: linux-lvm




On Wednesday, October 10, 2001 08:37:31 PM -0400 Ed Tomlinson <tomlins@CAM.ORG> wrote:

> I just finished running about 40 mins of dbench filling 3 1G snapshots.
> No stalls/oopses.  I am creating the snapshot with
> 
> lvcreate -L1G -s -n snap /dev/lv/root /dev/hda3
> 
> /dev/lv/root lives on hde.  Does your snapshot live on the same physical
> disk as its source disk?

Yes, at least partly.

> 
> Can you please send me the version of stress.sh you are using?  I'll see
> if that triggers anything here.

Attached.  I run stress.sh -n 10 -c (some ~40MB dir) -s /mntpoint.
Each process makes a local copy of the working dir, and verifies the 
contents of the copy.

thanks,
Chris

[-- Attachment #2: stress.sh --]
[-- Type: application/x-sh, Size: 3126 bytes --]

