* [linux-lvm] LVM 0.9 snapshot and ext3
From: Jay Weber @ 2000-11-29 6:09 UTC
To: linux-lvm
Hi,
I'm getting an oops when using ext3 with an LVM snapshot, in the case where
the snapshot runs out of space. I would expect a full snapshot to simply be
disabled, and ext3 probably shouldn't oops, since the journal is being
written to the active source volume and not to the snapshot.
This is with LVM 0.9 and ext3 0.0.5b, using the writeback data journaling
mode.
I've included the output below for anyone interested.
----
EXT3-fs: recovery complete.
EXT3-fs: mounted filesystem with writeback data mode.
lvm -- giving up to snapshot /dev/vol/foo on /dev/vol/snap due out of space
Bad lvm_map in ll_rw_block
Assertion failure in journal_write_metadata_buffer() at journal.c line 302: "!buffer_locked(bh_in)"
Unable to handle kernel NULL pointer dereference at virtual address 00000000
current->tss.cr3 = 00101000, %cr3 = 00101000
*pde = 00000000
Oops: 0002
CPU: 0
EIP: 0010:[<c0155540>]
EFLAGS: 00010282
eax: 00000067 ebx: 00000000 ecx: 000000d4 edx: c6da6000
esi: c3b85c60 edi: c3b85c60 ebp: 00000000 esp: c3ac7e64
ds: 0018 es: 0018 ss: 0018
Process kjournald (pid: 968, process nr: 60, stackpage=c3ac7000)
Stack: c01e2c20 0000012e c01e2c7a c3ac7fc4 c3b85c60 c7e3aac0 c3ac7fc8 00000000
       c015851c c7e3aac0 c3b85c60 c3ac7fc4 00000427 c6161a40 c5d9c8c0 c3aed3c0
       c5ae82a0 c5ae85a0 c5ae86a0 c6fb3560 c6fb3de0 c6fb3ee0 c10114a0 c7b89b20
Call Trace: [<c01e2c20>] [<c01e2c7a>] [<c015851c>] [<c0155341>] [<c0155230>] [<c0108c5f>]
Code: c6 05 00 00 00 00 00 83 c4 14 89 f6 8b 7c 24 1c 8b 47 18 a8
* Re: [linux-lvm] LVM 0.9 snapshot and ext3
From: Jay Weber @ 2000-11-29 6:21 UTC
To: linux-lvm
Using ext2 I don't oops, so maybe it is an ext3 issue. I still do get the
Bad lvm_map in ll_rw_block though. I don't recall that in 0.8, but then
again maybe I never filled a snapshot's reserved space. :)
* Re: [linux-lvm] LVM 0.9 snapshot and ext3
From: Jay Weber @ 2000-11-29 6:24 UTC
To: linux-lvm
Proper OOPS output below, sorry, pasted useless one in prior message.
Nov 28 22:12:52 slippey kernel: Bad lvm_map in ll_rw_block
Nov 28 22:12:52 slippey kernel: Assertion failure in journal_write_metadata_buffer() at journal.c line 302: "!buffer_locked(bh_in)"
Nov 28 22:12:52 slippey kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000000
Nov 28 22:12:52 slippey kernel: current->tss.cr3 = 00101000, %cr3 = 00101000
Nov 28 22:12:52 slippey kernel: *pde = 00000000
Nov 28 22:12:52 slippey kernel: Oops: 0002
Nov 28 22:12:52 slippey kernel: CPU: 0
Nov 28 22:12:52 slippey kernel: EIP: 0010:[journal_write_metadata_buffer+60/436]
Nov 28 22:12:52 slippey kernel: EFLAGS: 00010282
Nov 28 22:12:52 slippey kernel: eax: 00000067 ebx: 00000000 ecx: 000000d4 edx: c6da6000
Nov 28 22:12:52 slippey kernel: esi: c3b85c60 edi: c3b85c60 ebp: 00000000 esp: c3ac7e64
Nov 28 22:12:53 slippey kernel: ds: 0018 es: 0018 ss: 0018
Nov 28 22:12:53 slippey kernel: Process kjournald (pid: 968, process nr: 60, stackpage=c3ac7000)
Nov 28 22:12:53 slippey kernel: Stack: c01e2c20 0000012e c01e2c7a c3ac7fc4 c3b85c60 c7e3aac0 c3ac7fc8 00000000
Nov 28 22:12:53 slippey kernel: c015851c c7e3aac0 c3b85c60 c3ac7fc4 00000427 c6161a40 c5d9c8c0 c3aed3c0
Nov 28 22:12:53 slippey kernel: c5ae82a0 c5ae85a0 c5ae86a0 c6fb3560 c6fb3de0 c6fb3ee0 c10114a0 c7b89b20
Nov 28 22:12:53 slippey kernel: Call Trace: [cprt+21728/40165] [cprt+21818/40165] [journal_commit_transaction+1356/3588] [kjournald+257/424] [commit_timeout+0/12] [kernel_thread+35/48]
Nov 28 22:12:53 slippey kernel: Code: c6 05 00 00 00 00 00 83 c4 14 89 f6 8b 7c 24 1c 8b 47 18 a8
* Re: [linux-lvm] LVM 0.9 snapshot and ext3
From: Andreas Dilger @ 2000-11-29 7:01 UTC
To: linux-lvm; +Cc: jweber, Stephen C. Tweedie
Jay Weber writes:
> Proper OOPS output below, sorry, pasted useless one in prior message.
Basically, the oops is from an assertion (debugging check) in the ext3
journal code testing for a condition that _shouldn't_ happen. In this case
"!buffer_locked(bh_in)", so it is getting a locked buffer (in the middle of
I/O) when it doesn't expect it. The ext3 0.0.5b code (which you are using)
should have been able to handle basic I/O failures and such, but this
looks a bit unusual.
Maybe LVM isn't cleaning up the buffer properly when it runs out of space
in the snapshot... Also, I haven't really dug into LVM snapshots much,
but shouldn't all the remapping be done on the read-only copy? Do LVM
snapshots store the modified blocks in the original LV, and the old (frozen)
blocks in the snapshot LV, or vice versa? If ext3 is writing to the
original LV, it should never run out of space. At most, the read-only
copy (which will not be ext3) should disappear - I'm not sure what LVM
does in this case.
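On the question of which side holds the old blocks: the lvm_snapshot_COW() hunks later in this thread READ from org_phys_dev and WRITE to snap_phys_dev, which suggests the frozen data is copied into the snapshot LV before the origin is overwritten. A minimal user-space sketch of that copy-on-write behaviour follows. It is a toy model only: every `toy_*` name, the block count and the two-slot exception store are assumptions made for illustration, and none of it is LVM code.

```c
#include <assert.h>
#include <string.h>

#define NBLOCKS    8   /* blocks in the toy origin volume (assumed size) */
#define SNAP_SLOTS 2   /* capacity of the toy snapshot's exception store */

struct toy_snapshot {
    int  active;              /* becomes 0 once the snapshot is dropped */
    int  used;                /* exception slots consumed so far */
    int  slot_of[NBLOCKS];    /* -1 means "block not copied yet" */
    char store[SNAP_SLOTS];   /* preserved (frozen) block contents */
};

struct toy_origin {
    char data[NBLOCKS];       /* live data, always written in place */
    struct toy_snapshot snap;
};

/* contents must point to at least NBLOCKS bytes */
static void toy_init(struct toy_origin *o, const char *contents)
{
    memcpy(o->data, contents, NBLOCKS);
    o->snap.active = 1;
    o->snap.used = 0;
    for (int i = 0; i < NBLOCKS; i++)
        o->snap.slot_of[i] = -1;
}

/*
 * Write to the origin. The old contents are saved to the snapshot first.
 * If the snapshot has no room left it is dropped, but the origin write
 * itself still goes ahead and reports success.
 */
static int toy_origin_write(struct toy_origin *o, int blk, char value)
{
    struct toy_snapshot *s = &o->snap;

    assert(blk >= 0 && blk < NBLOCKS);
    if (s->active && s->slot_of[blk] < 0) {
        if (s->used == SNAP_SLOTS) {
            s->active = 0;                    /* out of space: give up */
        } else {
            s->store[s->used] = o->data[blk]; /* copy old data out */
            s->slot_of[blk] = s->used++;
        }
    }
    o->data[blk] = value;
    return 0;
}

/* Read through the snapshot; returns -1 once the snapshot is gone. */
static int toy_snapshot_read(const struct toy_origin *o, int blk)
{
    const struct toy_snapshot *s = &o->snap;

    assert(blk >= 0 && blk < NBLOCKS);
    if (!s->active)
        return -1;
    return s->slot_of[blk] >= 0 ? s->store[s->slot_of[blk]] : o->data[blk];
}
```

In this model the origin never runs out of space; only the snapshot can, and when it does the only thing lost is the frozen view.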
Note also that 0.0.5b with writeback mode is still fairly untested,
so there is some chance it is an ext3 issue. Stephen previously wrote
that data=ordered may actually be faster than data=writeback.
I've CC'd this to Stephen in case he has any ideas.
Cheers, Andreas
--
Andreas Dilger \ "If a man ate a pound of pasta and a pound of antipasto,
\ would they cancel out, leaving him still hungry?"
http://www-mddsp.enel.ucalgary.ca/People/adilger/ -- Dogbert
* Re: [linux-lvm] LVM 0.9 snapshot and ext3
From: Andreas Dilger @ 2000-11-29 10:18 UTC
To: linux-lvm; +Cc: jweber, Stephen C. Tweedie, Heinz J. Mauelshagen
I previously wrote:
> Basically, the oops is from an assertion (debugging check) in the ext3
> journal code testing for a condition that _shouldn't_ happen. In this case
> "!buffer_locked(bh_in)", so it is getting a locked buffer (in the middle of
> I/O) when it doesn't expect it. The ext3 0.0.5b code (which you are using)
> should have been able to handle basic I/O failures and such, but this
> looks a bit unusual.
>
> Maybe LVM isn't cleaning up the buffer properly when it runs out of space
> in the snapshot...
Judging by the assertion (the buffer is still locked), and the fact that
this flag BH_Lock is what's set when a buffer is being written, and
cleared by wait_on_buffer(), LVM may need to be waiting on its snapshot
I/Os to complete if it will later destroy the snapshot...
Looking at lvm_snapshot_COW(), there are a few places where it may need to
wait, for example after the kiovec READ and before we start the WRITE (so
that the data in the blocks is correct)... Since the WRITE is to blocks
_outside_ where ext3 is writing them, it may well be that the READ is still
going on for the blocks in the ext3 (original) LV (which are therefore
locked) at the time the snapshot LV fills up, hence the ext3 oops
(speculation here).
> Also, I haven't really dug into LVM snapshots much,
> but shouldn't all the remapping be done on the read-only copy? Do LVM
> snapshots store the modified blocks in the original LV, and the old (frozen)
> blocks in the snapshot LV, or vice versa? If ext3 is writing to the
> original LV, it should never run out of space. At most, the read-only
> copy (which will not be ext3) should disappear - I'm not sure what LVM
> does in this case.
As an aside, I had trouble reading the lvm_map() function because of the
many nested if cases and the many uses of {rsector,rdev}_tmp, so I have
re-done it and attached a patch. It _should_ be 100% identical to the
existing code operation (mostly reversing tests and early exits). We
now have:
{rsector,rdev}_org = original values
{rsector,rdev}_map = mapped values
{rsector,rdev}_tmp = junk values only used to see if chunk already mapped
Some questionable areas:
- the old code saves {rsector,rdev}_tmp before lvm_snapshot_remap_block(),
but these are only changed if the blocks are already mapped (ret = 1, a
case we don't care about), so no need to keep these for anything, right?
- calling lvm_snapshot_COW() you pass rsector_tmp and rsector_sav for the
org_phys_sector and org_virt_sector, yet they should always be identical
if we get this far (per above), since lvm_snapshot_remap_block() will not
have changed rsector_tmp (org_sector) if ret=0
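For anyone who wants to check the arithmetic in lvm_map(), the linear and striped mapping expressions from the patch can be lifted into a small user-space function. This is only a sketch: `struct toy_pe`, `struct toy_lv` and `toy_map()` are hypothetical stand-ins for the kernel structures, devices are plain integers, and no locking or snapshot handling is modelled.

```c
#include <assert.h>

/* One physical extent: a device id and the extent's starting sector. */
struct toy_pe {
    unsigned long dev;
    unsigned long pe;
};

/* The handful of LV fields the mapping arithmetic uses. */
struct toy_lv {
    unsigned long pe_size;       /* sectors per physical extent */
    unsigned long stripes;       /* fewer than 2 means linear */
    unsigned long stripesize;    /* sectors per stripe chunk */
    unsigned long allocated_le;  /* logical extents in the LV */
    const struct toy_pe *current_pe;
};

/* Map a logical sector (rsector_org) to a device and physical sector. */
static void toy_map(const struct toy_lv *lv, unsigned long rsector_org,
                    unsigned long *rdev_map, unsigned long *rsector_map)
{
    unsigned long index;

    assert(lv->pe_size > 0);
    if (lv->stripes < 2) {                        /* linear mapping */
        index = rsector_org / lv->pe_size;
        *rsector_map = lv->current_pe[index].pe +
                       rsector_org % lv->pe_size;
    } else {                                      /* striped mapping */
        unsigned long stripe_length = lv->pe_size * lv->stripes;
        unsigned long stripe_index =
            (rsector_org % stripe_length) / lv->stripesize;

        index = rsector_org / stripe_length +
                (stripe_index % lv->stripes) *
                (lv->allocated_le / lv->stripes);
        *rsector_map = lv->current_pe[index].pe +
                       (rsector_org % stripe_length) -
                       (stripe_index % lv->stripes) * lv->stripesize -
                       stripe_index / lv->stripes *
                       (lv->stripes - 1) * lv->stripesize;
    }
    *rdev_map = lv->current_pe[index].dev;
}
```

Keeping the original sector (rsector_org) untouched and writing results only to the `_map` outputs is the same separation the renaming in the patch aims for.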
Cheers, Andreas
===========================================================================
--- lvm.c.orig Sun Nov 19 20:35:44 2000
+++ lvm.c Wed Nov 29 02:30:06 2000
@@ -1601,10 +1601,10 @@
ulong index;
ulong pe_start;
ulong size = bh->b_size >> 9;
- ulong rsector_tmp = bh->b_blocknr * size;
- ulong rsector_sav;
- kdev_t rdev_tmp = bh->b_dev;
- kdev_t rdev_sav;
+ ulong rsector_org = bh->b_blocknr * size;
+ ulong rsector_map;
+ kdev_t rdev_org = bh->b_dev;
+ kdev_t rdev_map;
vg_t *vg_this = vg[VG_BLK(minor)];
lv_t *lv = vg_this->lv[LV_BLK(minor)];
@@ -1619,91 +1619,66 @@
if ((rw == WRITE || rw == WRITEA) &&
!(lv->lv_access & LV_WRITE)) {
printk(KERN_CRIT
- "%s - lvm_map: ll_rw_blk write for readonly LV %s\n",
+ "%s - lvm_map: ll_rw_blk write for readonly LV %s\n",
lvm_name, lv->lv_name);
return -1;
}
#ifdef DEBUG_MAP
printk(KERN_DEBUG
- "%s - lvm_map minor:%d *rdev: %02d:%02d *rsector: %lu "
- "size:%lu\n",
- lvm_name, minor,
- MAJOR(rdev_tmp),
- MINOR(rdev_tmp),
- rsector_tmp, size);
+ "%s - lvm_map minor:%d *rdev: %s *rsector: %lu size:%lu\n",
+ lvm_name, minor, kdevname(rdev_org), rsector_org, size);
#endif
- if (rsector_tmp + size > lv->lv_size) {
+ if (rsector_org + size > lv->lv_size) {
printk(KERN_ALERT
- "%s - lvm_map access beyond end of device; *rsector: "
- "%lu or size: %lu wrong for minor: %2d\n",
- lvm_name, rsector_tmp, size, minor);
+ "%s - lvm_map access beyond end of device; *rsector: %lu"
+ " or lv_size: %lu wrong for minor: %2d\n",
+ lvm_name, rsector_org, size, minor);
return -1;
}
- rsector_sav = rsector_tmp;
- rdev_sav = rdev_tmp;
lvm_second_remap:
- /* linear mapping */
- if (lv->lv_stripes < 2) {
+ if (lv->lv_stripes < 2) { /* linear mapping */
/* get the index */
- index = rsector_tmp / vg_this->pe_size;
+ index = rsector_org / vg_this->pe_size;
pe_start = lv->lv_current_pe[index].pe;
- rsector_tmp = lv->lv_current_pe[index].pe +
- (rsector_tmp % vg_this->pe_size);
- rdev_tmp = lv->lv_current_pe[index].dev;
+ rsector_map = lv->lv_current_pe[index].pe +
+ (rsector_org % vg_this->pe_size);
+ rdev_map = lv->lv_current_pe[index].dev;
-#ifdef DEBUG_MAP
- printk(KERN_DEBUG
- "lv_current_pe[%ld].pe: %ld rdev: %02d:%02d rsector:%ld\n",
- index,
- lv->lv_current_pe[index].pe,
- MAJOR(rdev_tmp),
- MINOR(rdev_tmp),
- rsector_tmp);
-#endif
-
- /* striped mapping */
- } else {
+ } else { /* striped mapping */
ulong stripe_index;
ulong stripe_length;
stripe_length = vg_this->pe_size * lv->lv_stripes;
- stripe_index = (rsector_tmp % stripe_length) / lv->lv_stripesize;
- index = rsector_tmp / stripe_length +
+ stripe_index = (rsector_org % stripe_length) /lv->lv_stripesize;
+ index = rsector_org / stripe_length +
(stripe_index % lv->lv_stripes) *
(lv->lv_allocated_le / lv->lv_stripes);
pe_start = lv->lv_current_pe[index].pe;
- rsector_tmp = lv->lv_current_pe[index].pe +
- (rsector_tmp % stripe_length) -
+ rsector_map = lv->lv_current_pe[index].pe +
+ (rsector_org % stripe_length) -
(stripe_index % lv->lv_stripes) * lv->lv_stripesize -
stripe_index / lv->lv_stripes *
(lv->lv_stripes - 1) * lv->lv_stripesize;
- rdev_tmp = lv->lv_current_pe[index].dev;
+ rdev_map = lv->lv_current_pe[index].dev;
}
#ifdef DEBUG_MAP
printk(KERN_DEBUG
- "lv_current_pe[%ld].pe: %ld rdev: %02d:%02d rsector:%ld\n"
+ "lv_current_pe[%ld].pe: %ld rdev: %s rsector:%ld\n"
"stripe_length: %ld stripe_index: %ld\n",
- index,
- lv->lv_current_pe[index].pe,
- MAJOR(rdev_tmp),
- MINOR(rdev_tmp),
- rsector_tmp,
- stripe_length,
- stripe_index);
+ index, lv->lv_current_pe[index].pe, kdevname(rdev_map),
+ rsector_map, stripe_length, stripe_index);
#endif
/* handle physical extents on the move */
if (pe_lock_req.lock == LOCK_PE) {
- if (rdev_tmp == pe_lock_req.data.pv_dev &&
- rsector_tmp >= pe_lock_req.data.pv_offset &&
- rsector_tmp < (pe_lock_req.data.pv_offset +
+ if (rdev_map == pe_lock_req.data.pv_dev &&
+ rsector_map >= pe_lock_req.data.pv_offset &&
+ rsector_map < (pe_lock_req.data.pv_offset +
vg_this->pe_size)) {
sleep_on(&lvm_map_wait);
- rsector_tmp = rsector_sav;
- rdev_tmp = rdev_sav;
goto lvm_second_remap;
}
}
@@ -1713,54 +1688,53 @@
else
lv->lv_current_pe[index].reads++;
- /* snapshot volume exception handling on physical device address base */
- if (lv->lv_access & (LV_SNAPSHOT|LV_SNAPSHOT_ORG)) {
- /* original logical volume */
- if (lv->lv_access & LV_SNAPSHOT_ORG) {
- if (rw == WRITE || rw == WRITEA)
- {
- lv_t *lv_ptr;
-
- /* start with first snapshot and loop thrugh all of them */
- for (lv_ptr = lv->lv_snapshot_next;
- lv_ptr != NULL;
- lv_ptr = lv_ptr->lv_snapshot_next) {
- /* Check for inactive snapshot */
- if (!(lv_ptr->lv_status & LV_ACTIVE)) continue;
- down(&lv->lv_snapshot_org->lv_snapshot_sem);
- /* do we still have exception storage for this snapshot free? */
- if (lv_ptr->lv_block_exception != NULL) {
- rdev_sav = rdev_tmp;
- rsector_sav = rsector_tmp;
- if (!lvm_snapshot_remap_block(&rdev_tmp,
- &rsector_tmp,
- pe_start,
- lv_ptr)) {
- /* create a new mapping */
- if (!(ret = lvm_snapshot_COW(rdev_tmp,
- rsector_tmp,
- pe_start,
- rsector_sav,
- lv_ptr)))
- ret = lvm_write_COW_table_block(vg_this,
- lv_ptr);
- }
- rdev_tmp = rdev_sav;
- rsector_tmp = rsector_sav;
- }
- up(&lv->lv_snapshot_org->lv_snapshot_sem);
- }
- }
- } else {
- /* remap snapshot logical volume */
- down(&lv->lv_snapshot_sem);
- if (lv->lv_block_exception != NULL)
- lvm_snapshot_remap_block(&rdev_tmp, &rsector_tmp, pe_start, lv);
- up(&lv->lv_snapshot_sem);
+ /* if not a snapshot volume, no need to do any more remapping */
+ if (!(lv->lv_access & (LV_SNAPSHOT|LV_SNAPSHOT_ORG)))
+ goto done;
+
+ if (lv->lv_access & LV_SNAPSHOT_ORG) { /* original logical volume */
+ lv_t *lv_ptr;
+
+ /* we don't need remapping if we aren't changing the block */
+ if (rw != WRITE && rw != WRITEA)
+ goto done;
+
+ /* start with first snapshot, loop through them all */
+ down(&lv->lv_snapshot_org->lv_snapshot_sem);
+ for (lv_ptr = lv->lv_snapshot_next; lv_ptr != NULL;
+ lv_ptr = lv_ptr->lv_snapshot_next) {
+ unsigned long rsector_tmp = rsector_map;
+ kdev_t rdev_tmp = rdev_map;
+ /* is this snapshot inactive? */
+ if (!(lv_ptr->lv_status & LV_ACTIVE))
+ continue;
+ /* are we out of space for this snapshot? */
+ if (lv_ptr->lv_block_exception == NULL)
+ continue;
+ /* is this chunk already mapped in this snapshot? */
+ if (lvm_snapshot_remap_block(&rdev_tmp, &rsector_tmp,
+ pe_start, lv_ptr))
+ continue;
+ /* create a new mapping for this chunk */
+ if (!(ret = lvm_snapshot_COW(rdev_map, rsector_map,
+ pe_start, rsector_map,
+ lv_ptr)))
+ ret = lvm_write_COW_table_block(vg_this,lv_ptr);
}
+ up(&lv->lv_snapshot_org->lv_snapshot_sem);
+ } else { /* snapshot logical volume */
+ /* remap to snapshot logical volume if a copy of this chunk
+ * exists in the snapshot, otherwise use original chunk */
+ down(&lv->lv_snapshot_sem);
+ if (lv->lv_block_exception != NULL)
+ lvm_snapshot_remap_block(&rdev_map, &rsector_map,
+ pe_start, lv);
+ up(&lv->lv_snapshot_sem);
}
- bh->b_rdev = rdev_tmp;
- bh->b_rsector = rsector_tmp;
+
+done:
+ bh->b_rdev = rdev_map;
+ bh->b_rsector = rsector_map;
return ret;
} /* lvm_map() */
* Re: [linux-lvm] LVM 0.9 snapshot and ext3
From: Stephen C. Tweedie @ 2000-11-30 9:46 UTC
To: Andreas Dilger
Cc: linux-lvm, jweber, Stephen C. Tweedie, Heinz J. Mauelshagen
Hi,
On Wed, Nov 29, 2000 at 03:18:06AM -0700, Andreas Dilger wrote:
> > Maybe LVM isn't cleaning up the buffer properly when it runs out of space
> > in the snapshot...
>
> Judging by the assertion (the buffer is still locked), and the fact that
> this flag BH_Lock is what's set when a buffer is being written, and
> cleared by wait_on_buffer(), LVM may need to be waiting on its snapshot
> I/Os to complete if it will later destroy the snapshot...
Right. There are only two ways the assertion can fail: the buffer is
locked for read, or for write. By this stage, ext3 has frozen all of
the filesystem operations which were working on this buffer, but may
have started collecting new updates while the old ones are committing,
so there is the possibility of ext3 doing concurrent access while we
are doing this journal operation.
So, is somebody else doing a read? It could be ext3, conceivably, if
we had had an IO error previously on the buffer and the BH_Uptodate
flag had been cleared: a subsequent access to the buffer by another
process would try to reread the buffer off disk, locking it
temporarily. Could you load the ext3 kdb debugging extensions and run
a "bh <buffer_head address>" on the buffer-head once the oops drops
you into kdb monitor mode? That way, we can find out whether the
dirty bit has been set on the buffer_head.
Cheers,
Stephen
* Re: [linux-lvm] LVM 0.9 snapshot and ext3
From: Andreas Dilger @ 2000-11-30 19:48 UTC
To: Stephen C. Tweedie
Cc: Andreas Dilger, linux-lvm, jweber, Heinz J. Mauelshagen
Stephen writes:
> So, is somebody else doing a read? It could be ext3, conceivably, if
> we had had an IO error previously on the buffer and the BH_Uptodate
> flag had been cleared: a subsequent access to the buffer by another
> process would try to reread the buffer off disk, locking it
> temporarily.
Considering that there was the "Bad lvm_map in ll_rw_block" message just
before the oops, this would lead to the situation you are talking about.
In ll_rw_block the error path at "sorry:" clears BH_Dirty and BH_Uptodate,
and calls b_end_io() but does not necessarily unlock the buffer?
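The sequence being discussed can be written down as a toy state machine. This is a sketch under stated assumptions, not kernel code: the `TOY_*` bits merely mirror the BH_Uptodate, BH_Dirty and BH_Lock flags named above, all `toy_*` functions are hypothetical, and whether the real error path leaves the lock bit alone is exactly the open question.

```c
#include <assert.h>

/* Toy versions of the buffer state bits named in the thread. */
enum {
    TOY_UPTODATE = 1 << 0,
    TOY_DIRTY    = 1 << 1,
    TOY_LOCK     = 1 << 2
};

struct toy_bh {
    unsigned state;
};

/* Error path as described: dirty and uptodate are cleared. */
static void toy_io_error(struct toy_bh *bh)
{
    bh->state &= ~(TOY_DIRTY | TOY_UPTODATE);
}

/*
 * A later access that finds the buffer not uptodate starts a re-read,
 * which locks the buffer until the read completes.
 */
static void toy_reader_access(struct toy_bh *bh)
{
    if (!(bh->state & TOY_UPTODATE))
        bh->state |= TOY_LOCK;
}

/* Read completion: contents are valid again and the lock is released. */
static void toy_read_complete(struct toy_bh *bh)
{
    assert(bh->state & TOY_LOCK);   /* only meaningful for a locked buffer */
    bh->state |= TOY_UPTODATE;
    bh->state &= ~TOY_LOCK;
}

/* The journal's precondition, i.e. the "!buffer_locked(bh_in)" check. */
static int toy_journal_may_write(const struct toy_bh *bh)
{
    return !(bh->state & TOY_LOCK);
}
```

If the journal's commit runs between toy_reader_access() and toy_read_complete(), its precondition fails, which is the window Stephen describes.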
Looking at the code more closely, I see that I may have introduced a bug
here, because I didn't understand what was going on:
- in lvm_map() we call lvm_snapshot_COW() and lvm_write_COW_table_block(),
  which set "ret"
- we return the last "ret" value to ll_rw_block()
- ll_rw_block() fails if the lvm_map() return value is non-zero
Really, the lvm_map() call should NOT return an error if the snapshot write
fails, because it needs to write the primary LV in this case. The
snapshot LV may fill up, and then be removed, but this should not affect
the I/O on the primary LV. There should only be an error if the primary
LV mapping fails, which will only happen on an inactive or R/O LV, or if
we try to write past the end of the LV.
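The intended rule (snapshot failures cost you the snapshot, never the primary I/O) can be sketched in user space as below. All names here (`toy_snap`, `toy_cow`, `toy_lvm_map`) are hypothetical, the COW step is reduced to a free-chunk counter, and the "already mapped" check is left out for brevity.

```c
#include <assert.h>
#include <stddef.h>

struct toy_snap {
    int active;        /* 0 once the snapshot has been dropped */
    int free_chunks;   /* remaining exception storage */
};

/* Stand-in for the COW copy: 0 on success, negative on failure. */
static int toy_cow(struct toy_snap *s)
{
    if (s->free_chunks == 0) {
        s->active = 0;             /* out of space: drop the snapshot */
        return -1;
    }
    s->free_chunks--;
    return 0;
}

/*
 * Returns 0 when the primary mapping is fine, -1 only for errors on the
 * primary LV itself (write to a read-only LV, access beyond the end).
 */
static int toy_lvm_map(unsigned long sector, unsigned long lv_size,
                       int lv_writable, int is_write,
                       struct toy_snap *snaps, size_t nsnaps)
{
    assert(snaps != NULL || nsnaps == 0);
    if (is_write && !lv_writable)
        return -1;
    if (sector >= lv_size)
        return -1;

    if (is_write) {
        for (size_t i = 0; i < nsnaps; i++) {
            if (!snaps[i].active)
                continue;
            /* A COW failure must not fail the write to the origin. */
            (void)toy_cow(&snaps[i]);
        }
    }
    return 0;
}
```

The explicit `(void)` cast marks the ignored return value as deliberate, which is the same choice the first patch makes for lvm_write_COW_table_block().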
*** Stephen, in case of ANY error inside lvm_map() should it unlock the buffer?
What do the low-level drivers do on an error? ll_rw_block() will already
call b_end_io for us at "sorry:", but it still clearly causes a problem for
ext3. This bug with the snapshots has only brought forth the *real* issue
of what to do in case any of the other errors in lvm_map() happens.
The simple fix for this problem is to remove "ret" from lvm_map() and always
return 0 at the end. The first patch does this. The second patch cleans
up a few other error return values to follow the normal kernel "-ve is error"
convention, and removes a bunch of needless gotos. It is based on my reorg
of lvm_map(), so if the first patch (the one that should "hide" this bug)
fails for you, just manually delete the 4 cases of "ret" in that function.
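The shape of the second cleanup (set a reason at the point of failure, jump to one error label, return a negative value) looks roughly like this in isolation. It is a sketch only: `toy_snapshot_cow()`, its parameters and `toy_last_reason` are made up for illustration and simply mimic the pattern in the lvm-snap.c hunks.

```c
#include <assert.h>
#include <stddef.h>

/* Records why the snapshot was dropped, so the sketch can be inspected. */
static const char *toy_last_reason = NULL;

static void toy_drop_snapshot(const char *reason)
{
    toy_last_reason = reason;
}

/*
 * One exit label instead of one label per failure: each check sets the
 * reason where the failure is detected, then jumps to "error".
 * Returns 0 on success and a negative value on failure.
 */
static int toy_snapshot_cow(int idx, int remap_end, int read_ok, int write_ok)
{
    const char *reason;

    assert(remap_end >= 0);
    if (idx >= remap_end) {
        reason = "out of space";
        goto error;
    }
    if (!read_ok) {
        reason = "read error";
        goto error;
    }
    if (!write_ok) {
        reason = "write error";
        goto error;
    }
    return 0;

error:
    toy_drop_snapshot(reason);
    return -1;
}
```

Callers can then test for `< 0` uniformly rather than remembering which functions use 1 for failure.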
Cheers, Andreas
===========================================================================
--- lvm.c.orig Sun Nov 19 20:35:44 2000
+++ lvm.c Thu Nov 30 12:17:17 2000
@@ -1597,7 +1597,6 @@
static int lvm_map(struct buffer_head *bh, int rw)
{
int minor = MINOR(bh->b_dev);
- int ret = 0;
ulong index;
ulong pe_start;
ulong size = bh->b_size >> 9;
@@ -1715,11 +1716,15 @@
if (lvm_snapshot_remap_block(&rdev_tmp, &rsector_tmp,
pe_start, lv_ptr))
continue;
- /* create a new mapping */
- if (!(ret = lvm_snapshot_COW(rdev_map, rsector_map,
- pe_start, rsector_map,
- lv_ptr)))
- ret = lvm_write_COW_table_block(vg_this,lv_ptr);
+ /*
+ * Create a new mapping for this chunk. If it fails,
+ * it will remove the snapshot, but this should not
+ * return the error to ll_rw_block(), which would stop
+ * the I/O on the primary LV copy.
+ */
+ if (!lvm_snapshot_COW(rdev_map, rsector_map,
+ pe_start, rsector_map, lv_ptr))
+ (void)lvm_write_COW_table_block(vg_this,lv_ptr);
}
up(&lv->lv_snapshot_org->lv_snapshot_sem);
} else { /* snapshot logical volume */
@@ -1740,7 +1740,7 @@
bh->b_rdev = rdev_map;
bh->b_rsector = rsector_map;
- return ret;
+ return 0;
} /* lvm_map() */
===========================================================================
--- lvm.c.orig Sun Nov 19 20:35:44 2000
+++ lvm.c Thu Nov 30 12:17:17 2000
@@ -2708,7 +2685,7 @@
up(&lv_ptr->lv_snapshot_org->lv_snapshot_sem);
vfree(lvbe_old);
vfree(lvs_hash_table_old);
- return 1;
+ return -ENOMEM;
}
for (e = 0; e < lv_ptr->lv_remap_ptr; e++)
--- lvm-snap.c.orig Tue Nov 14 05:52:45 2000
+++ lvm-snap.c Thu Nov 30 11:52:57 2000
@@ -295,7 +305,10 @@
if (brw_kiovec(WRITE, 1, &iobuf, snap_phys_dev,
blocks, blksize_snap, 0) != blksize_snap)
#endif
- goto fail_raw_write;
+ {
+ reason = "write error";
+ goto error;
+ }
/* initialization of next COW exception table block with zeroes */
@@ -323,7 +336,10 @@
if (brw_kiovec(WRITE, 1, &iobuf, snap_phys_dev,
blocks, blksize_snap, 0) != blksize_snap)
#endif
- goto fail_raw_write;
+ {
+ reason = "write error";
+ goto error;
+ }
}
@@ -334,13 +350,9 @@
return 0;
/* slow path */
- out:
+ error:
lvm_drop_snapshot(lv_snap, reason);
- return 1;
+ return -1;
-
- fail_raw_write:
- reason = "write error";
- goto out;
}
/*
@@ -366,8 +378,10 @@
int max_sectors, nr_sectors;
/* check if we are out of snapshot space */
- if (idx >= lv_snap->lv_remap_end)
- goto fail_out_of_space;
+ if (idx >= lv_snap->lv_remap_end) {
+ reason = "out of space";
+ goto error;
+ }
/* calculate physical boundaries of source chunk */
pe_off = org_pe_start % chunk_size;
@@ -401,8 +415,10 @@
min_blksize = min(blksize_org, blksize_snap);
max_sectors = KIO_MAX_SECTORS * (min_blksize>>9);
- if (chunk_size % (max_blksize>>9))
- goto fail_blksize;
+ if (chunk_size % (max_blksize>>9)) {
+ reason = "block size error";
+ goto error;
+ }
while (chunk_size)
{
@@ -420,7 +436,10 @@
if (brw_kiovec(READ, 1, &iobuf, org_phys_dev,
blocks, blksize_org, 0) != (nr_sectors<<9))
#endif
- goto fail_raw_read;
+ {
+ reason = "read error";
+ goto error;
+ }
lvm_snapshot_prepare_blocks(blocks, snap_start,
nr_sectors, blksize_snap);
@@ -431,7 +450,10 @@
if (brw_kiovec(WRITE, 1, &iobuf, snap_phys_dev,
blocks, blksize_snap, 0) != (nr_sectors<<9))
#endif
- goto fail_raw_write;
+ {
+ reason = "write error";
+ goto error;
+ }
}
#ifdef DEBUG_SNAPSHOT
@@ -455,22 +477,9 @@
return 0;
/* slow path */
- out:
+ error:
lvm_drop_snapshot(lv_snap, reason);
- return 1;
+ return -1;
-
- fail_out_of_space:
- reason = "out of space";
- goto out;
- fail_raw_read:
- reason = "read error";
- goto out;
- fail_raw_write:
- reason = "write error";
- goto out;
- fail_blksize:
- reason = "blocksize error";
- goto out;
}
int lvm_snapshot_alloc_iobuf_pages(struct kiobuf * iobuf, int sectors)
* Re: [linux-lvm] LVM 0.9 snapshot and ext3
From: Andreas Dilger @ 2000-11-30 20:16 UTC
To: Andreas Dilger; +Cc: Linux LVM mailing list, jweber
I wrote:
> The simple fix for this problem is to remove "ret" from lvm_map, and always
> returning "0" at the end.
Strangely, in 2.4 lvm_make_request() we totally ignore the return code from
lvm_map() and always return 1. This should also be fixed once lvm_map()
no longer returns the snapshot error codes. There is currently a thread
on linux-kernel (cc'd to linux-lvm) discussing this very issue, so this is
a good time to clean this whole thing up.
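
[Editor's sketch] The two points above (a snapshot failure must not fail the mapping, and the caller should not throw away the mapping result) can be modeled in user space roughly as follows. All types and function names here are hypothetical; this is not lvm_map() or lvm_make_request(), and the real return-value convention of the 2.4 make_request hook is not modeled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical request model; not the kernel's struct buffer_head. */
struct req_model {
	unsigned long sector;   /* logical sector on the origin LV */
	unsigned long rsector;  /* remapped physical sector, set by the map */
};

/* Hypothetical origin LV with at most one snapshot attached. */
struct lv_model {
	unsigned long size;     /* LV size in sectors */
	unsigned long pe_start; /* physical offset of the LV */
	int snapshot_active;    /* 1 while a snapshot is attached */
	int snapshot_full;      /* 1 once the snapshot has no free space */
};

/* Models the COW step: fails and drops the snapshot when it is full. */
static int snapshot_cow(struct lv_model *lv)
{
	if (lv->snapshot_full) {
		lv->snapshot_active = 0;
		return -1;
	}
	return 0;
}

/* Returns 0 on success, negative only if the primary mapping fails. */
static int map_request(struct lv_model *lv, struct req_model *rq)
{
	assert(lv != NULL && rq != NULL);

	if (rq->sector >= lv->size)
		return -1;              /* write past the end of the LV */
	if (lv->snapshot_active)
		(void)snapshot_cow(lv); /* a failure only costs the snapshot */
	rq->rsector = lv->pe_start + rq->sector;
	return 0;
}

/* Propagates the map result instead of returning a constant. */
static int make_request(struct lv_model *lv, struct req_model *rq)
{
	return map_request(lv, rq);
}
```

With a full snapshot, make_request() in this model still succeeds and remaps the sector while the snapshot is silently deactivated; only a request beyond the end of the LV is reported as an error.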
Cheers, Andreas
--
Andreas Dilger \ "If a man ate a pound of pasta and a pound of antipasto,
\ would they cancel out, leaving him still hungry?"
http://www-mddsp.enel.ucalgary.ca/People/adilger/ -- Dogbert
* Re: [linux-lvm] LVM 0.9 snapshot and ext3
2000-11-30 19:48 ` Andreas Dilger
2000-11-30 20:16 ` Andreas Dilger
@ 2000-11-30 22:04 ` Jay Weber
2000-12-01 15:06 ` Stephen C. Tweedie
2 siblings, 0 replies; 10+ messages in thread
From: Jay Weber @ 2000-11-30 22:04 UTC (permalink / raw)
To: Andreas Dilger; +Cc: linux-lvm, Heinz J. Mauelshagen
Okay, I got my kdb fixed up and applied the patch, but I have no idea whether
it solves the issue, because I'm now hitting the OTHER problem I ran into
earlier but hadn't finished looking into.
It's the issue where I get an oops when I do a vgchange -ay (or, in this
case, drop into kdb). I cannot get my volume active again at this
point to see whether the snapshot problem still exists. :)
Oh, I'm starting to think I'm running into the vgchange bug as a result of
the snapshot out-of-space bug. I was unable to recreate it before, but I
just did.
I can recreate this at will now on my laptop, so if there are any registers
or such you'd like me to dig up, I'll be glad to dig into them via kdb for
you. :)
I'm attaching the kdb/oops output below if you'd care to take a look.
----
Nov 30 14:01:07 slippey kernel: Unable to handle kernel NULL pointer
dereference at virtual address 000001b0
Nov 30 14:01:07 slippey kernel: current->tss.cr3 = 06183000, %cr3 =
06183000
Nov 30 14:01:07 slippey kernel: *pde = 06180067
Nov 30 14:01:07 slippey kernel: *pte = 00000000
Nov 30 14:01:07 slippey kernel: Entering kdb due to panic @ 0xc01afdcf
Nov 30 14:01:07 slippey kernel: eax = 0x00000000 ebx = 0xc7f900c0 ecx =
0x00000100 edx = 0x00000000
Nov 30 14:01:07 slippey kernel: esi = 0xc7f900c0 edi = 0x0000ffff esp =
0x00000400 eip = 0xc01afdcf
Nov 30 14:01:07 slippey kernel: ebp = 0xc6185b0c ss = 0x0000ffff cs =
0x00000010 eflags = 0x00010246
Nov 30 14:01:07 slippey kernel: ds = 0xc0120018 es = 0x00000018
origeax = 0xffffffff &regs = 0xc6185acc
Nov 30 14:01:07 slippey kernel: kdb> bt
Nov 30 14:01:07 slippey kernel: EBP EIP Function(args)
Nov 30 14:01:07 slippey kernel: 0xc6185b0c 0xc01afdcf
lvm_pv_get_number+0x3f( 0xc7f90000, 0xffff)
Nov 30 14:01:07 slippey kernel: 0xc6185b64 0xc01b0062
lvm_snapshot_fill_COW_page+0x10e( 0xc7f90000, 0xc65aca00, 0x0)
Nov 30 14:01:07 slippey kernel: 0xc6185bb8 0xc01ae2a3
lvm_do_lv_create+0x557( 0x0, 0xc6185c14, 0xc6185c14, 0xc5c09c40)
Nov 30 14:01:07 slippey kernel: 0xc6185da8 0xc01ad5a1
lvm_do_vg_create+0x47d( 0x0, 0x158bc0, 0xc5c09c40, 0xffffffe7)
Nov 30 14:01:07 slippey kernel: 0xc6185f90 0xc01aafaf
lvm_chr_ioctl+0x2bf( 0xc6160d70, 0xc5c09c40, 0x4004fe00, 0x158bc0,
0xc6184000)
Nov 30 14:01:07 slippey kernel: 0xc6185fbc 0xc0137ebd sys_ioctl+0x19d(
0x4, 0x4004fe00, 0x158bc0, 0x4, 0x158bc0)
Nov 30 14:01:07 slippey kernel: 0xbffffa88 0xc010b83c system_call
Nov 30 14:01:07 slippey kernel: Oops: 0000
Nov 30 14:01:07 slippey kernel: CPU: 0
Nov 30 14:01:07 slippey kernel: EIP: 0010:[lvm_pv_get_number+63/76]
Nov 30 14:01:07 slippey kernel: EFLAGS: 00010246
Nov 30 14:01:07 slippey kernel: eax: 00000000 ebx: c7f900c0 ecx:
00000100 edx: 00000000
Nov 30 14:01:07 slippey kernel: esi: c7f900c0 edi: 0000ffff ebp:
c6185b0c esp: c6185b08
Nov 30 14:01:07 slippey kernel: ds: 0018 es: 0018 ss: 0018
Nov 30 14:01:07 slippey kernel: Process vgchange (pid: 646, process nr:
38, stackpage=c6185000)
Nov 30 14:01:07 slippey kernel: Stack: c7f90000 c6185b64 c01b0062 c7f90000
0000ffff 0000007e c65aca00 c65aca00
Nov 30 14:01:07 slippey kernel: 00000900 00000000 c65aca00 c65acb68
c6185b50 c01b0a29 c03921c0 00000000
Nov 30 14:01:07 slippey kernel: 0000007d c65aca00 c01b0abe 00000060
00000020 c5ad8000 00000000 c6185bb8
Nov 30 14:01:07 slippey kernel: Call Trace:
[lvm_snapshot_fill_COW_page+270/468] (0) [lvm_do_lv_create+1367/1804] (88)
[lvm_do_vg_create+1149/1244] (84) [lvm_chr_ioctl+703/2032] (496)
[sys_ioctl+413/436] (488) [system_call+52/56] (44)
Nov 30 14:01:07 slippey kernel: Code: 8b 80 b0 01 00 00 5b 5e 5f 89 ec 5d
c3 55 89 e5 83 ec 18 57
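
[Editor's sketch] The fault address in the log above can be cross-checked against the "Code:" bytes. Assuming those bytes start at the faulting instruction, "8b 80 b0 01 00 00" is a 32-bit load from disp32(%eax) into %eax, and with eax = 00000000 the effective address is 0x1b0, which matches the reported address 000001b0. That is consistent with reading a field at offset 0x1b0 through a NULL pointer; which structure is involved cannot be told from the log alone. The helper names below are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Decode the little-endian 32-bit displacement of a
 * "mov disp32(%eax),%eax" instruction (opcode 8b, ModRM 80).
 * Returns 0 on success, -1 if the bytes are not that instruction.
 */
static int decode_mov_eax_disp32(const uint8_t *code, size_t len,
				 uint32_t *disp)
{
	assert(code != NULL && disp != NULL);

	if (len < 6 || code[0] != 0x8b || code[1] != 0x80)
		return -1;
	*disp = (uint32_t)code[2] | ((uint32_t)code[3] << 8) |
		((uint32_t)code[4] << 16) | ((uint32_t)code[5] << 24);
	return 0;
}

/* Effective address of the load: base register plus displacement. */
static uint32_t fault_address(uint32_t eax, uint32_t disp)
{
	return eax + disp;
}
```

Feeding the first six code bytes and eax = 0 into these helpers gives the address to compare with the "Unable to handle kernel NULL pointer dereference" line.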
On Thu, 30 Nov 2000, Andreas Dilger wrote:
> Date: Thu, 30 Nov 2000 12:48:06 -0700 (MST)
> From: Andreas Dilger <adilger@turbolinux.com>
> To: "Stephen C. Tweedie" <sct@redhat.com>
> Cc: Andreas Dilger <adilger@turbolinux.com>, linux-lvm@sistina.com,
> jweber@valinux.com, "Heinz J. Mauelshagen" <mauelshagen@sistina.com>
> Subject: Re: [linux-lvm] LVM 0.9 snapshot and ext3
>
> Stephen writes:
> > So, is somebody else doing a read? It could be ext3, conceivably, if
> > we had had an IO error previously on the buffer and the BH_uptodate
> > flag had been cleared: a subsequent access to the buffer by another
> > process would try to reread the buffer off disk, locking it
> > temporarily.
>
> Considering that there was the "Bad lvm_map in ll_rw_block" message just
> before the oops, this would lead to the situation you are talking about.
> In ll_rw_block the error path at "sorry:" clears BH_Dirty and BH_Uptodate,
> and calls b_end_io() but does not necessarily unlock the buffer?
>
> Looking at the code more closely, I see that I may have introduced a bug
> here, because I didn't understand what was going on:
>
> - in lvm_map() we call lvm_snapshot_COW() and lvm_write_COW_table(), set "ret"
> - we return the last "ret" value to ll_rw_block()
> - ll_rw_block() fails if the lvm_map() return value is non-zero
>
> Really, the lvm_map() call should NOT return an error if the snapshot write
> fails, because it needs to write the primary LV in this case. The
> snapshot LV may fill up, and then be removed, but this should not affect
> the I/O on the primary LV. There should only be an error if the primary
> LV mapping fails, which will only happen on an inactive or R/O LV, or if
> we try to write past the end of the LV.
>
> *** Stephen, in case of ANY error inside lvm_map() should it unlock the buffer?
> What do the low-level drivers do on an error? ll_rw_block() will already
> call b_end_io for us at "sorry:", but it still clearly causes a problem for
> ext3. This bug with the snapshots has only brought forth the *real* issue
> of what to in case any of the other errors in lvm_map() happens.
>
>
> The simple fix for this problem is to remove "ret" from lvm_map, and always
> returning "0" at the end. The first patch does this. The second part cleans
> up a few other error return values, to follow the normal kernel "-ve is error"
> standard, and remove a bunch of needless gotos. It is based on my reorg
> of lvm_map, so if the first patch (the one that should "hide" this bug)
> fails for you, just manually delete the 4 cases of "ret" in that function.
>
> Cheers, Andreas
> ===========================================================================
>
> --- lvm.c.orig Sun Nov 19 20:35:44 2000
> +++ lvm.c Thu Nov 30 12:17:17 2000
> @@ -1597,7 +1597,6 @@
> static int lvm_map(struct buffer_head *bh, int rw)
> {
> int minor = MINOR(bh->b_dev);
> - int ret = 0;
> ulong index;
> ulong pe_start;
> ulong size = bh->b_size >> 9;
> @@ -1715,11 +1716,15 @@
> if (lvm_snapshot_remap_block(&rdev_tmp, &rsector_tmp,
> pe_start, lv_ptr))
> continue;
> - /* create a new mapping */
> - if (!(ret = lvm_snapshot_COW(rdev_map, rsector_map,
> - pe_start, rsector_map,
> - lv_ptr)))
> - ret = lvm_write_COW_table_block(vg_this,lv_ptr);
> + /*
> + * Create a new mapping for this chunk. If it fails,
> + * it will remove the snapshot, but this should not
> + * return the error to ll_rw_block(), which would stop
> + * the I/O on the primary LV copy.
> + */
> + if (!lvm_snapshot_COW(rdev_map, rsector_map,
> + pe_start, rsector_map, lv_ptr))
> + (void)lvm_write_COW_table_block(vg_this,lv_ptr);
> }
> up(&lv->lv_snapshot_org->lv_snapshot_sem);
> } else { /* snapshot logical volume */
> @@ -1740,7 +1740,7 @@
> bh->b_rdev = rdev_map;
> bh->b_rsector = rsector_map;
>
> - return ret;
> + return 0;
> } /* lvm_map() */
>
>
> ===========================================================================
> --- lvm.c.orig Sun Nov 19 20:35:44 2000
> +++ lvm.c Thu Nov 30 12:17:17 2000
> @@ -2708,7 +2685,7 @@
> up(&lv_ptr->lv_snapshot_org->lv_snapshot_sem);
> vfree(lvbe_old);
> vfree(lvs_hash_table_old);
> - return 1;
> + return -ENOMEM;
> }
>
> for (e = 0; e < lv_ptr->lv_remap_ptr; e++)
> --- lvm-snap.c.orig Tue Nov 14 05:52:45 2000
> +++ lvm-snap.c Thu Nov 30 11:52:57 2000
> @@ -295,7 +305,10 @@
> if (brw_kiovec(WRITE, 1, &iobuf, snap_phys_dev,
> blocks, blksize_snap, 0) != blksize_snap)
> #endif
> - goto fail_raw_write;
> + {
> + reason = "write error";
> + goto error;
> + }
>
>
> /* initialization of next COW exception table block with zeroes */
> @@ -323,7 +336,10 @@
> if (brw_kiovec(WRITE, 1, &iobuf, snap_phys_dev,
> blocks, blksize_snap, 0) != blksize_snap)
> #endif
> - goto fail_raw_write;
> + {
> + reason = "write error";
> + goto error;
> + }
> }
>
>
> @@ -334,13 +350,9 @@
> return 0;
>
> /* slow path */
> - out:
> + error:
> lvm_drop_snapshot(lv_snap, reason);
> - return 1;
> + return -1;
> -
> - fail_raw_write:
> - reason = "write error";
> - goto out;
> }
>
> /*
> @@ -366,8 +378,10 @@
> int max_sectors, nr_sectors;
>
> /* check if we are out of snapshot space */
> - if (idx >= lv_snap->lv_remap_end)
> - goto fail_out_of_space;
> + if (idx >= lv_snap->lv_remap_end) {
> + reason = "out of space";
> + goto error;
> + }
>
> /* calculate physical boundaries of source chunk */
> pe_off = org_pe_start % chunk_size;
> @@ -401,8 +415,10 @@
> min_blksize = min(blksize_org, blksize_snap);
> max_sectors = KIO_MAX_SECTORS * (min_blksize>>9);
>
> - if (chunk_size % (max_blksize>>9))
> - goto fail_blksize;
> + if (chunk_size % (max_blksize>>9)) {
> + reason = "block size error";
> + goto error;
> + }
>
> while (chunk_size)
> {
> @@ -420,7 +436,10 @@
> if (brw_kiovec(READ, 1, &iobuf, org_phys_dev,
> blocks, blksize_org, 0) != (nr_sectors<<9))
> #endif
> - goto fail_raw_read;
> + {
> + reason = "read error";
> + goto error;
> + }
>
> lvm_snapshot_prepare_blocks(blocks, snap_start,
> nr_sectors, blksize_snap);
> @@ -431,7 +450,10 @@
> if (brw_kiovec(WRITE, 1, &iobuf, snap_phys_dev,
> blocks, blksize_snap, 0) != (nr_sectors<<9))
> #endif
> - goto fail_raw_write;
> + {
> + reason = "write error";
> + goto error;
> + }
> }
>
> #ifdef DEBUG_SNAPSHOT
> @@ -455,22 +477,9 @@
> return 0;
>
> /* slow path */
> - out:
> + error:
> lvm_drop_snapshot(lv_snap, reason);
> - return 1;
> + return -1;
> -
> - fail_out_of_space:
> - reason = "out of space";
> - goto out;
> - fail_raw_read:
> - reason = "read error";
> - goto out;
> - fail_raw_write:
> - reason = "write error";
> - goto out;
> - fail_blksize:
> - reason = "blocksize error";
> - goto out;
> }
>
> int lvm_snapshot_alloc_iobuf_pages(struct kiobuf * iobuf, int sectors)
> --
> Andreas Dilger \ "If a man ate a pound of pasta and a pound of antipasto,
> \ would they cancel out, leaving him still hungry?"
> http://www-mddsp.enel.ucalgary.ca/People/adilger/ -- Dogbert
>
* Re: [linux-lvm] LVM 0.9 snapshot and ext3
2000-11-30 19:48 ` Andreas Dilger
2000-11-30 20:16 ` Andreas Dilger
2000-11-30 22:04 ` Jay Weber
@ 2000-12-01 15:06 ` Stephen C. Tweedie
2 siblings, 0 replies; 10+ messages in thread
From: Stephen C. Tweedie @ 2000-12-01 15:06 UTC (permalink / raw)
To: Andreas Dilger
Cc: Stephen C. Tweedie, linux-lvm, jweber, Heinz J. Mauelshagen
Hi,
On Thu, Nov 30, 2000 at 12:48:06PM -0700, Andreas Dilger wrote:
> Stephen writes:
> > So, is somebody else doing a read? It could be ext3, conceivably, if
> > we had had an IO error previously on the buffer and the BH_uptodate
> > flag had been cleared: a subsequent access to the buffer by another
> > process would try to reread the buffer off disk, locking it
> > temporarily.
>
> Considering that there was the "Bad lvm_map in ll_rw_block" message just
> before the oops, this would lead to the situation you are talking about.
> In ll_rw_block the error path at "sorry:" clears BH_Dirty and BH_Uptodate,
> and calls b_end_io() but does not necessarily unlock the buffer?
The end-io should do the unlock no matter the value of uptodate. I
should go and check what ext3 is doing on synchronous IO failures,
though.
> *** Stephen, in case of ANY error inside lvm_map() should it unlock the buffer?
ll_rw_block needs to unlock the buffer after the IO is complete, no
matter what the error. The LVM people will know better than I whether
the lvm_map() function is the right place to do that.
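
[Editor's sketch] The rule stated above, that the completion handler unlocks the buffer whether or not the I/O succeeded, can be modeled in user space as below. struct buf_model, end_io() and submit_buffer() are hypothetical; they are not the kernel's struct buffer_head, b_end_io() or ll_rw_block(), and map_result merely stands in for the value a mapping function would return.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical buffer model; not the kernel's struct buffer_head. */
struct buf_model {
	int locked;    /* set while I/O is in flight */
	int uptodate;  /* set when the contents match the disk */
	int dirty;     /* set when the contents still need writing */
};

/* Completion handler: records the result and always unlocks. */
static void end_io(struct buf_model *b, int uptodate)
{
	assert(b != NULL);
	b->uptodate = uptodate;
	b->locked = 0;
}

/*
 * Submit one buffer. Returns 0 if the I/O was queued, -1 if it
 * failed before submission (the "sorry:" style path in the thread).
 */
static int submit_buffer(struct buf_model *b, int map_result)
{
	assert(b != NULL);
	b->locked = 1;
	if (map_result != 0) {
		/* failed mapping: clear state, then complete with an error */
		b->dirty = 0;
		end_io(b, 0);
		return -1;
	}
	/* A real driver would queue the I/O here; completion calls end_io(). */
	return 0;
}
```

In this model a failed mapping leaves the buffer unlocked and not up to date, so a later reader would have to reread it, which is the situation discussed earlier in the thread.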
Cheers,
Stephen
Thread overview: 10+ messages
2000-11-29 6:09 [linux-lvm] LVM 0.9 snapshot and ext3 Jay Weber
2000-11-29 6:21 ` Jay Weber
2000-11-29 6:24 ` Jay Weber
2000-11-29 7:01 ` Andreas Dilger
2000-11-29 10:18 ` Andreas Dilger
2000-11-30 9:46 ` Stephen C. Tweedie
2000-11-30 19:48 ` Andreas Dilger
2000-11-30 20:16 ` Andreas Dilger
2000-11-30 22:04 ` Jay Weber
2000-12-01 15:06 ` Stephen C. Tweedie