linux-lvm.redhat.com archive mirror
* [linux-lvm] Snapshot question...
From: Charles Marcus @ 2008-04-22 10:30 UTC
  To: linux-lvm

Ok, now I have a question on how snapshots work...

I've read the following links:

http://tldp.org/HOWTO/LVM-HOWTO/snapshotintro.html

http://tldp.org/HOWTO/LVM-HOWTO/snapshots_backup.html

but am still unsure of something...

When a snapshot volume is created, it only needs to be large enough to 
accommodate any *changes* to the volume I'll be taking a snapshot of, 
correct? So, on a low volume mail server, 5GB should be way more than 
enough, especially if done in the middle of the night when it is 
essentially idling, right? I'm guessing that 500MB would probably be 
more than enough, but disk space is cheap, and I have plenty.

So, I have allocated 5GB of 'free space' in my volume group for snapshot 
use.

What I don't understand is... why am I mounting and then backing up the 
newly created snapshot volume, if it only contains the *changes* to the 
volume I really want to back up - which in my case is /var/virtual?

If the snapshot volume is only 5GB, how, by backing up *this* volume, am 
I backing up over 100GB of data that is in a different volume?

Tia for any responses,

-- 

Best regards,

Charles

* Re: [linux-lvm] Snapshot question...
From: Stephane Chazelas @ 2008-04-22 10:47 UTC
  To: LVM general discussion and development

2008-04-22 06:30:44 -0400, Charles Marcus:
[...]
> So, I have allocated 5GB of 'free space' in my volume group for snapshot 
> use.
>
> What I don't understand is... why am I mounting and then backing up the 
> newly created snapshot volume, if it only contains the *changes* to the 
> volume I really want to back up - which in my case is /var/virtual?
>
> If the snapshot volume is only 5GB, how, by backing up *this* volume, am I 
> backing up over 100GB of data that is in a different volume?
[...]

When you create a snapshot, LVM2 does 4 things:

- it allocates a normal (linear) COW (copy-on-write) storage
  volume of the specified size, which you don't normally access
  directly.
- it creates a "snapshot" volume, which is a virtual volume that
  allows you to access the frozen version of the snapshotted
  volume.

(if it hasn't been done before):
- it duplicates the original volume's mapping (not its data) as
  xxxx-real
- it changes the original volume to be a "snapshot-origin" type
  which refers to the "real" volume and the "COW" volume.

When you read from the "snapshot" volume (the one you're backing
up), data that hasn't been modified since the snapshot is read
from the "real" volume and from the "COW" volume otherwise.

When you read from your original volume (that has been changed
to be of "snapshot-origin" type), you actually read from the
"real" volume (the original version of the original volume), but
when you write to it, the system makes sure that unless it has
been done already, the blocks you're modifying are copied
first from the "real" volume to COW volume before being modified
in the "real" volume.

So to sum up, the "snapshot" volume you are creating is a
"virtual" volume that is a front end to both the snapshot
storage volume ("COW") and the original real volume ("real").

Hope this clarifies a bit,
Stephane
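
As a rough illustration of the description above (not LVM's actual implementation), the read and write paths can be modelled in a few lines of Python. The class and method names are made up for this sketch, blocks are plain list items, and the on-disk COW format is ignored entirely.

```python
class SnapshotModel:
    """Toy model of one origin volume with a single copy-on-write snapshot."""

    def __init__(self, blocks):
        self.real = list(blocks)  # the "-real" device: current contents of the origin
        self.cow = {}             # COW store: block index -> contents at snapshot time

    def write_origin(self, index, data):
        # Preserve the old contents once, before the first overwrite of this block.
        if index not in self.cow:
            self.cow[index] = self.real[index]
        self.real[index] = data

    def read_origin(self, index):
        return self.real[index]

    def read_snapshot(self, index):
        # Changed blocks come from the COW store, unchanged ones from "real".
        return self.cow.get(index, self.real[index])


vol = SnapshotModel(["a", "b", "c", "d"])
vol.write_origin(1, "B")
vol.write_origin(1, "B2")  # second write to the same block: no extra copy
frozen = [vol.read_snapshot(i) for i in range(4)]
print(frozen)        # ['a', 'b', 'c', 'd']
print(vol.real)      # ['a', 'B2', 'c', 'd']
print(len(vol.cow))  # 1
```

Reading every block through `read_snapshot` is what a backup of the snapshot device amounts to: it sees the whole volume as it was at snapshot time, while the COW store only holds the one block that changed.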

* Re: [linux-lvm] Snapshot question...
From: Brian J. Murrell @ 2008-04-22 16:12 UTC
  To: LVM general discussion and development

On Tue, 2008-04-22 at 11:47 +0100, Stephane Chazelas wrote:
> 
> but
> when you write to it, the system makes sure that unless it has
> been done already, the blocks you're modifying are copied
> first from the "real" volume to COW volume before being modified
> in the "real" volume.

And to be clear, the COW volumes of _all_ snapshots[1].  This is where
the snapshot scaling problem arises.

Zumastor claim to have solved that with their snapshots.  I'd love to
have time to play with it, indeed.

If Zumastor are here listening, can you answer, are there any plans to
get your snapshots into the main kernel tree?

b.


* Re: [linux-lvm] Snapshot question...
From: Charles Marcus @ 2008-04-22 16:28 UTC
  To: LVM general discussion and development

On 4/22/2008, Stephane Chazelas (stephane.chazelas@emerson.com) wrote:
> When you read from your original volume (that has been changed
> to be of "snapshot-origin" type), you actually read from the
> "real" volume (the original version of the original volume), but
> when you write to it, the system makes sure that unless it has
> been done already, the blocks you're modifying are copied
> first from the "real" volume to COW volume before being modified
> in the "real" volume.
> 
> So to sum up, the "snapshot" volume you are creating is a
> "virtual" volume that is a front end to both the snapshot
> storage volume ("COW") and the original real volume ("real").
> 
> Hope this clarifies a bit,

Thanks for trying, but no, that just made my head hurt...

;)

Seriously... if the snapshot volume that I'm creating is a front end to 
BOTH, when I back it up, I guess LVM just 'knows' that I mean to back up 
the 'original'?

Is there a graphical outline of how this works? I seem to do better with 
visualizations...

-- 

Best regards,

Charles

* Re: [linux-lvm] Snapshot question...
From: Stephane Chazelas @ 2008-04-22 16:38 UTC
  To: LVM general discussion and development

2008-04-22 12:12:14 -0400, Brian J. Murrell:
> On Tue, 2008-04-22 at 11:47 +0100, Stephane Chazelas wrote:
> > 
> > but
> > when you write to it, the system makes sure that unless it has
> > been done already, the blocks you're modifying are copied
> > first from the "real" volume to COW volume before being modified
> > in the "real" volume.
> 
> And to be clear, the COW volumes of _all_ snapshots[1].  This is where
> the snapshot scaling problem arises.
> 
> Zumastor claim to have solved that with their snapshots.  I'd love to
> have time to play with it, indeed.
[...]

Indeed,

actually I've spent the last 3 days trying to put zumastor (well
actually ddsnap as I didn't use zumastor in the end) in place
here.

The fact that there are user space daemons involved in the
process makes it very difficult to use it for the root file
system. I had to use a number of hacks for that.

Also, it seems that you can't snapshot a live (mounted) file
system as you would do with LVM2 unless you have set up that FS
beforehand as a virtual ddsnap device (LVM2 is able to suspend
the device, and reload it as a snapshot-origin for that; I
couldn't manage to do the same with ddsnap).

I also tried applying the patches to 2.6.25 and got some oops
(2.6.24.2 is fine though).

Once I've finalised it, I can post what I've come up with if
anyone is interested.

The least I can say is that I learnt a lot about grub2,
initramfs-tools, device-mapper, raid/mdadm, lvm2 in the past few
days ;).

Cheers,
Stephane

* Re: [linux-lvm] Snapshot question...
From: dave @ 2008-04-22 16:47 UTC
  To: LVM general discussion and development

> Seriously... if the snapshot volume that I'm creating is a front end to 
> BOTH, when I back it up, I guess LVM just 'knows' that I mean to back up 
> the 'original'?

The snapshot *looks like* a copy of the original volume for all intents and purposes.  You can ignore the fact that it really only saves the difference between the two volumes.  As far as you are concerned, the snapshot is identical to the volume it was created from at the time of the snapshot.  

All the COW stuff happens behind the scenes.  LVM knows which blocks to get from the origin, and which blocks to get from the snapshot, because it knows which blocks have changed since the snapshot was created.

If you want to get more confused, check out this Google image search:
http://images.google.com/images?hl=en&q=copy+on+write&btnG=Search+Images&gbv=2

* Re: [linux-lvm] Snapshot question...
From: Dan Kegel @ 2008-04-22 16:51 UTC
  To: LVM general discussion and development

2008/4/22 Brian J. Murrell <brian@interlinx.bc.ca>:
>  Zumastor claim to have solved that with their snapshots.  I'd love to
>  have time to play with it, indeed.
>
>  If Zumastor are here listening, can you answer, are there any plans to
>  get your snapshots into the main kernel tree?

We're listening.
The patches are a little invasive (they touch the VM
system -- zumastor triggers an old deadlock in the kernel,
and we have a fix for it), so
it's not clear they will be accepted unless zumastor
has proven itself in the field.  We're working on
increasing our usability and performance now
to attract more users, i.e. we're slowly making our Ubuntu
packages (including the kernel flavor) follow all the
Debian and Ubuntu rules, and are now building them
in a PPA at Launchpad for Hardy.   We also have fairly
good Gentoo support.  (Sorry, we didn't have any Fedora
people on the project, so no Fedora packaging yet.)
- Dan

* Re: [linux-lvm] Snapshot question...
From: Stephane Chazelas @ 2008-04-22 16:52 UTC
  To: LVM general discussion and development

2008-04-22 12:28:10 -0400, Charles Marcus:
[...]
>> So to sum up, the "snapshot" volume you are creating is a
>> "virtual" volume that is a front end to both the snapshot
>> storage volume ("COW") and the original real volume ("real").
>>
>> Hope this clarifies a bit,
>
> Thanks for trying, but no, that just made my head hurt...
>
> ;)
>
> Seriously... if the snapshot volume that I'm creating is a front end to 
> BOTH, when I back it up, I guess LVM just 'knows' that I mean to back up the 
> 'original'?
>
> Is there a graphical outline of how this works? I seem to do better with 
> visualizations...
[...]

I can try another wording.

Your snapshot device is a *virtual* device. And if you do a cat
/dev/vg/snapshot, you'll get 100GB worth of data which will be
exactly the same as you would have gotten if you had done a cat
/dev/vg/original at the time you did the snapshot.

To make that virtual snapshot work, LVM internally uses another
device, a real one this time, which you don't access directly.
If you do a cat /dev/mapper/vg-snapshot-cow, you'll get 5 GB of
data, but that data will be useless to you: it's in a special
format recognised by the device-mapper and used to store only the
blocks that were changed in your original device since the
snapshot.

The "dmsetup status" or "lvdisplay" commands should be able to
tell you how much of the COW volume has already been allocated
to store the "modifications". When all the space there has been
used, the virtual device will stop working altogether.

Is that better?
Stephane
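
A minimal sketch of the "COW volume fills up" case described above, under simplifying assumptions: capacity is counted in whole blocks (a real COW volume also spends space on metadata), the names are hypothetical, and the model assumes that an overflowing snapshot is simply given up while writes to the origin carry on.

```python
class BoundedSnapshot:
    """Toy origin plus one snapshot whose COW store holds a limited number of blocks."""

    def __init__(self, blocks, cow_capacity):
        self.real = list(blocks)
        self.cow = {}
        self.cow_capacity = cow_capacity  # hypothetical unit: whole blocks, no metadata
        self.valid = True

    def write_origin(self, index, data):
        if self.valid and index not in self.cow:
            if len(self.cow) >= self.cow_capacity:
                # No room to preserve the old block: the snapshot is given up.
                self.valid = False
                self.cow.clear()
            else:
                self.cow[index] = self.real[index]
        self.real[index] = data  # the write to the origin always goes through

    def used_percent(self):
        return 100.0 * len(self.cow) / self.cow_capacity

    def read_snapshot(self, index):
        if not self.valid:
            raise OSError("snapshot invalid: COW store overflowed")
        return self.cow.get(index, self.real[index])


snap = BoundedSnapshot(["a", "b", "c", "d"], cow_capacity=2)
snap.write_origin(0, "A")
print(snap.used_percent())  # 50.0
snap.write_origin(1, "B")
print(snap.used_percent())  # 100.0
snap.write_origin(2, "C")   # third distinct block: the COW store is full
print(snap.valid)           # False
print(snap.real)            # ['A', 'B', 'C', 'd']
```

The percentage printed here plays the role of the "Allocated to snapshot" figure in lvdisplay: it grows with the number of distinct blocks overwritten on the origin since the snapshot was taken, not with the total amount written.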

* Re: [linux-lvm] Snapshot question...
From: Dan Kegel @ 2008-04-22 16:54 UTC
  To: LVM general discussion and development

On Tue, Apr 22, 2008 at 9:38 AM, Stephane Chazelas
<stephane.chazelas@emerson.com> wrote:
>  actually I've spent the last 3 days trying to put zumastor (well
>  actually ddsnap as I didn't use zumastor in the end) in place
>  here.
>
>  The fact that there are user space daemons involved in the
>  process makes it very difficult to use it for the root file
>  system. I had to use a number of hacks for that.

You're braver than us.  We haven't tried using it for the root fs yet.

>  Also, it seems that you can't snapshot a live (mounted) file
>  system as you would do with LVM2 unless you have set up that FS
>  beforehand as a virtual ddsnap device (LVM2 is able to suspend
>  the device, and reload it as a snapshot-origin for that; I
>  couldn't manage to do the same with ddsnap).

Hmm.  I'll bring up that use case with Dan Phillips, see what he says.

>  I also tried applying the patches to 2.6.25 and got some oops
>  (2.6.24.2 is fine though).
>
>  Once I've finalised it, I can post what I've come up with if
>  anyone is interested.

Please do (heck, post it to the zumastor list, too).
- Dan

* Re: [linux-lvm] Snapshot question...
From: Charles Marcus @ 2008-04-22 17:09 UTC
  To: LVM general discussion and development

On 4/22/2008, dave@frop.net (dave@frop.net) wrote:
> The snapshot *looks like* a copy of the original volume for all 
> intents and purposes.  You can ignore the fact that it really only 
> saves the difference between the two volumes.  As far as you are 
> concerned, the snapshot is identical to the volume it was created 
> from at the time of the snapshot. 

and

> Your snapshot device is a *virtual* device. And if you do a cat
> /dev/vg/snapshot, you'll get 100GB worth of data which will be
> exactly the same as you would have gotten if you had done a cat
> /dev/vg/original at the time you did the snapshot.

<lightbulb>
Ahh... like say a link (soft? hard? not that it matters)...

Ok, makes sense now...

And yes, I figured the COW stuff was LVM voodoo-magic, so it's good to 
know that I don't 'need to know'... :)
</lightbulb>

At least, I hope I got it now...

Lastly...

Should the snapshot volume fill up, what happens to the original? I'm 
guessing that it would be protected from any kind of data loss?

Thanks for the hand-holding...

-- 

Best regards,

Charles

* Re: [linux-lvm] Snapshot question...
From: Dan Kegel @ 2008-04-22 17:39 UTC
  To: LVM general discussion and development

On Tue, Apr 22, 2008 at 9:54 AM, Dan Kegel <dank@kegel.com> wrote:
>  >  Also, it seems that you can't snapshot a live (mounted) file
>  >  system as you would do with LVM2 unless you have set up that FS
>  >  beforehand as a virtual ddsnap device (LVM2 is able to suspend
>  >  the device, and reload it as a snapshot-origin for that; I
>  >  couldn't manage to do the same with ddsnap).
>
>  Hmm.  I'll bring up that use case with Dan Phillips, see what he says.

Heh.  You cheated.  You're snapshotting devices already
in LVM2, so it already has control.

Perhaps the thing to do is to integrate ddsnap into lvm;
that way one could choose between traditional lvm
snapshots for compatibility, or spiffy new shared-storage
ddsnap snapshots for scalability.

If people try ddsnap / zumastor and think it's worth it,
maybe we could have a look at merging it with lvm somehow.
That would be an interesting job.
- Dan

* Re: [linux-lvm] Snapshot question...
From: Charles Marcus @ 2008-04-23 10:08 UTC
  To: LVM general discussion and development

On 4/22/2008, Charles Marcus (CMarcus@Media-Brokers.com) wrote:
> Lastly...
> 
> Should the snapshot volume fill up, what happens to the original? I'm 
> guessing that it would be protected from any kind of data loss?

Never mind... tfm answers this (automatically released)...

Hmmm... this appears to be a dev oriented list? Is there one aimed more 
at lowly users? I hate bothering the magicians behind the curtain with 
simple/dumb end-user questions like above...

-- 

Best regards,

Charles

* Re: [linux-lvm] Snapshot question...
From: Stephane Chazelas @ 2008-04-23 11:49 UTC
  To: LVM general discussion and development

2008-04-22 10:39:59 -0700, Dan Kegel:
> On Tue, Apr 22, 2008 at 9:54 AM, Dan Kegel <dank@kegel.com> wrote:
> >  >  Also, it seems that you can't snapshot a live (mounted) file
> >  >  system as you would do with LVM2 unless you have set up that FS
> >  >  beforehand as a virtual ddsnap device (LVM2 is able to suspend
> >  >  the device, and reload it as a snapshot-origin for that; I
> >  >  couldn't manage to do the same with ddsnap).
> >
> >  Hmm.  I'll bring up that use case with Dan Phillips, see what he says.
> 
> Heh.  You cheated.  You're snapshotting devices already
> in LVM2, so it already has control.
> 
> Perhaps the thing to do is to integrate ddsnap into lvm;
> that way one could choose between traditional lvm
> snapshots for compatibility, or spiffy new shared-storage
> ddsnap snapshots for scalability.
> 
> If people try ddsnap / zumastor and think it's worth it,
> maybe we could have a look at merging it with lvm somehow.
> That would be an interesting job.
[...]

Hi Dan,

What I meant is that, with LVM2 snapshots, before you create
your first snapshot, you have your LV A created as a "linear"
(or stripe or mirror) mapping of a /physical/ block device.

So you have for instance:

$ sudo dmsetup table
vg-A: 0 199991296 linear 9:0 384

And possibly the /dev/mapper/vg-A is mounted on /.

In order to be able to do a snapshot of that, you have to change
it to a "snapshot-origin" type. That's what lvcreate -s does.

If you do an lvcreate -s -n snap /dev/vg/A:

- it creates a /dev/mapper/vg-snap-cow volume
- it creates a copy (a dm device with the same table) of
  /dev/mapper/vg-A as /dev/mapper/vg-A-real
- it suspends /dev/mapper/vg-A (as if doing dmsetup suspend
  vg-A)
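- it does the equivalent of:
  echo <start> <length> snapshot-origin /dev/mapper/vg-A-real \
  | dmsetup reload vg-A
  So that vg-A is changed on the fly. That step wouldn't work
  without the suspend. The new vg-snap device is loaded with the
  matching table:
  <start> <length> snapshot /dev/mapper/vg-A-real \
  /dev/mapper/vg-snap-cow P 16
- it resumes /dev/mapper/vg-A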

I tried to do something similar (not on / but on a /dev/loop
test) with ddsnap, but it didn't work, the commands were
successful, but the vg-A was still seen as a "linear" device in
the output of dmsetup table.

I think this is different from having ddsnap integrated in LVM2.
An advantage of making LVM2 aware of ddsnap would be so that
ddsnap devices can be more easily brought back up upon startup.
But that would mean "vgchange -ay" would have to start the agent
and server which would still raise a number of issues when run
in initrds.

Cheers,
Stephane
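
To make the device-mapper tables involved in the steps above a little more concrete, here is a small Python helper that only builds the table strings; it does not call dmsetup. The function name and parameters are invented for this sketch, the target formats follow the device-mapper snapshot documentation as best understood here, and LVM's escaping of hyphens inside VG/LV names is deliberately not handled.

```python
def snapshot_tables(vg, origin, snap, sectors, chunk_sectors=16, persistent=True):
    """Build dm table lines for an origin and one snapshot (illustrative only)."""
    real = f"/dev/mapper/{vg}-{origin}-real"  # hidden copy of the origin's old mapping
    cow = f"/dev/mapper/{vg}-{snap}-cow"      # hidden COW store
    flag = "P" if persistent else "N"         # persistent vs. non-persistent snapshot
    return {
        # The visible origin becomes a thin "snapshot-origin" layer over -real.
        f"{vg}-{origin}": f"0 {sectors} snapshot-origin {real}",
        # The visible snapshot combines -real with its COW store.
        f"{vg}-{snap}": f"0 {sectors} snapshot {real} {cow} {flag} {chunk_sectors}",
    }


tables = snapshot_tables("vg", "A", "snap", 199991296)
for name, table in tables.items():
    print(f"{name}: {table}")
# vg-A: 0 199991296 snapshot-origin /dev/mapper/vg-A-real
# vg-snap: 0 199991296 snapshot /dev/mapper/vg-A-real /dev/mapper/vg-snap-cow P 16
```

The sector count is the one from the dmsetup table output quoted in the message; on a real system, `dmsetup table` after `lvcreate -s` is the authoritative way to see what LVM actually loaded.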

* Re: [linux-lvm] Snapshot question... [scaling problem]
From: Ross Boylan @ 2008-04-24 3:57 UTC
  To: LVM general discussion and development


On Tue, 2008-04-22 at 12:12 -0400, Brian J. Murrell wrote:
> On Tue, 2008-04-22 at 11:47 +0100, Stephane Chazelas wrote:
> > 
> > but
> > when you write to it, the system makes sure that unless it has
> > been done already, the blocks you're modifying are copied
> > first from the "real" volume to COW volume before being modified
> > in the "real" volume.
> 
> And to be clear, the COW volumes of _all_ snapshots[1].  This is where
> the snapshot scaling problem arises.
Could someone say a bit more, because I definitely don't follow this.
If the [1] after snapshots means something, I don't know what.

I've seen references to this scaling problem before, and have never
quite gotten it.  The quoted paragraph above sounds as if it means
whenever there is a write to any volume, the COW tables for all
snapshots get the contents that are about to be overwritten.

I have multiple snapshots active, and it doesn't seem to work this way.
Some volumes have big snapshot volumes, into which substantial material
is written.  Others have small snapshot volumes, into which very little
is written.  In particular, the little ones don't seem to be getting the
writes to the big ones.  I judge how much of the COW is in use from the
output of lvdisplay (attached at bottom)

If all COW volumes get the overwritten material, then they'd all have
the same size in use.  And they don't seem to.

As I said, I'm missing something.  Could someone enlighten me?

Ross Boylan

lvdisplay for big volume shows
 LV Name                /dev/daisy/_var_spool_cyrus
 VG Name                daisy
 LV UUID                vUs6Bj-2DRx-4OVt-7FBr-K2dW-3gTV-AHpNRZ
 LV Write Access        read/write
 LV snapshot status     active destination for /dev/daisy/cyrspool
 LV Status              available
 # open                 0
 LV Size                23.62 GB
 Current LE             756
 COW-table size         12.00 GB
 COW-table LE           384
 Allocated to snapshot  29.78% 
 Snapshot chunk size    4.00 KB
 Segments               1
 Allocation             inherit
 Read ahead sectors     auto
 - currently set to     256
 Block device           254:41

and for the little one it shows
 LV Name                /dev/daisy/_var_lib_cyrus
 VG Name                daisy
 LV UUID                qkcv3k-XBYW-2gcx-pjbM-oF0g-Mlfk-P28QcH
 LV Write Access        read/write
 LV snapshot status     active destination for /dev/daisy/cyrlib
 LV Status              available
 # open                 0
 LV Size                96.00 MB
 Current LE             3
 COW-table size         128.00 MB
 COW-table LE           4
 Allocated to snapshot  33.32% 
 Snapshot chunk size    4.00 KB
 Segments               1
 Allocation             inherit
 Read ahead sectors     auto
 - currently set to     256
 Block device           254:38

I think this means the first one used about 3.6 GB (12 GB * 0.2978) while
the second used about 42.6 MB (128 MB * 0.3332).  The amount written to the
first one exceeds the total size available (128 MB) for the 2nd snapshot.
I'm running Linux
kernel 2.6.18 as packaged for Debian.
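
The estimate above can be checked with a couple of lines of Python, using the "COW-table size" and "Allocated to snapshot" values from the two lvdisplay outputs. The helper name is made up for this sketch.

```python
def cow_used(cow_size, percent_allocated):
    """Space in use in a COW volume, in the same unit as cow_size."""
    return cow_size * percent_allocated / 100.0


big_gb = cow_used(12.00, 29.78)     # /dev/daisy/_var_spool_cyrus, in GB
small_mb = cow_used(128.00, 33.32)  # /dev/daisy/_var_lib_cyrus, in MB

print(f"{big_gb:.2f} GB")       # 3.57 GB
print(f"{small_mb:.2f} MB")     # 42.65 MB
print(big_gb * 1024 > 128.00)   # True: more than the small COW volume could hold
```

The two snapshots belong to different origin volumes, which is why their usage figures are unrelated.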

* Re: [linux-lvm] Snapshot question... [scaling problem]
From: Dan Kegel @ 2008-04-24 4:10 UTC
  To: LVM general discussion and development

On Wed, Apr 23, 2008 at 8:57 PM, Ross Boylan <ross@biostat.ucsf.edu> wrote:
>  > > but
>  > > when you write to it, the system makes sure that unless it has
>  > > been done already, the blocks you're modifying are copied
>  > > first from the "real" volume to COW volume before being modified
>  > > in the "real" volume.
>  >
>  > And to be clear, the COW volumes of _all_ snapshots[1].  This is where
>  > the snapshot scaling problem arises.
>  Could someone say a bit more, because I definitely don't follow this.

If you have a single original volume, and you keep ten snapshots
of it, and then you write a block to the original volume, you
may end up needing to write eleven blocks.  Ouch!

Thus the write overhead of LVM snapshots scales poorly with the
number of snapshots per volume.

Note that LVM snapshots scale well with the number of volumes,
but that's not interesting or surprising, as each volume is independent.
- Dan
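
The "eleven blocks" figure can be reproduced with a toy model in which every snapshot of a volume has its own COW store. This is only a sketch: the class name and the write counter are invented, and all ten snapshots are assumed to have been taken at the same moment.

```python
class Origin:
    """Toy origin volume with several independent per-snapshot COW stores."""

    def __init__(self, blocks, n_snapshots):
        self.real = list(blocks)
        self.cows = [{} for _ in range(n_snapshots)]  # one COW store per snapshot
        self.device_writes = 0

    def write(self, index, data):
        # Each snapshot that has not yet saved this block needs its own copy.
        for cow in self.cows:
            if index not in cow:
                cow[index] = self.real[index]
                self.device_writes += 1
        self.real[index] = data
        self.device_writes += 1  # the write to the origin itself


vol = Origin(["x"] * 8, n_snapshots=10)
vol.write(3, "y")
print(vol.device_writes)  # 11: ten COW copies plus the origin write
vol.write(3, "z")
print(vol.device_writes)  # 12: the block is already saved everywhere
```

Only the first overwrite of a given block pays the full price; later writes to the same block cost a single write, which is why the overhead depends on how scattered the writes are.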

* Re: [linux-lvm] Snapshot question... [scaling problem]
From: Larry Dickson @ 2008-04-24 14:21 UTC
  To: LVM general discussion and development

There is also a subtle point here. If one of the older snapshots already
"saw" a change to that block, which came on before a newer snapshot was
started, then you will write to the new COW and not to the old. But as Dan
says, that can still leave a lot of sloshing around... especially with data
of the "hit each one rarely" type.

There's an almost trivial variant on this, where you keep the
(read-only) snapshots in a time-ordered sequence, and freeze the last
snapshot COW at the same moment as you start the next snapshot. Then writing
only ever hits the new snapshot COW, and reading from any older snapshot
(virtual) volume involves figuring out which is the first after that to hold
the block, but still involves reading only one block. I wonder why LVM does
not do this. Perhaps Zumastor does? Or somebody else?

Larry Dickson
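
A rough Python sketch of the time-ordered variant proposed above, as read from the description; it is not how LVM or Zumastor work, and the names are hypothetical. Only the newest COW store ever receives copies, and a read of an older snapshot walks forward through the newer COW stores until it finds the block.

```python
class ChainedSnapshots:
    """Toy model: read-only snapshots kept as a time-ordered chain of COW stores."""

    def __init__(self, blocks):
        self.current = list(blocks)
        self.cows = []  # oldest first; only the last one receives copies

    def take_snapshot(self):
        self.cows.append({})  # all earlier COW stores are frozen from now on
        return len(self.cows) - 1

    def write(self, index, data):
        # At most one copy per write, into the newest COW store only.
        if self.cows and index not in self.cows[-1]:
            self.cows[-1][index] = self.current[index]
        self.current[index] = data

    def read_snapshot(self, snap, index):
        # First hit wins: own COW store, then each newer one, then live data.
        for cow in self.cows[snap:]:
            if index in cow:
                return cow[index]
        return self.current[index]


vol = ChainedSnapshots(["a", "b"])
s0 = vol.take_snapshot()
vol.write(0, "A")
s1 = vol.take_snapshot()
vol.write(0, "AA")
vol.write(1, "B")

print([vol.read_snapshot(s0, i) for i in range(2)])  # ['a', 'b']
print([vol.read_snapshot(s1, i) for i in range(2)])  # ['A', 'b']
print(vol.current)                                   # ['AA', 'B']
```

Block 1 of the older snapshot is found in the newer COW store, which illustrates why older snapshots depend on the COW stores that come after them.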


* Re: [linux-lvm] Snapshot question... [scaling problem]
From: Stuart D. Gathman @ 2008-04-24 15:59 UTC
  To: LVM general discussion and development

On Thu, 24 Apr 2008, Larry Dickson wrote:

> There's an almost trivial variant on this, where you keep the
> (read-only) snapshots in a time-ordered sequence, and freeze the last
> snapshot COW at the same moment as you start the next snapshot. Then writing
> only ever hits the new snapshot COW, and reading from any older snapshot
> (virtual) volume involves figuring out which is the first after that to hold
> the block, but still involves reading only one block. I wonder why LVM does
> not do this. Perhaps Zumastor does? Or somebody else?

Then you can't delete an older snapshot until you delete all newer ones.

Zumastor works by using one COW table shared between all snapshots
for a volume.  Blocks are added to the COW in time order.  The origin
ignores COW blocks before the last time point (block offset), writing a new COW
block for any modified since that time point.  The snapshots also use
timepoints in a way that is straightforward, but I don't want to think
about it at the moment :-)

-- 
	      Stuart D. Gathman <stuart@bmsi.com>
    Business Management Systems Inc.  Phone: 703 591-0911 Fax: 703 591-6154
"Confutatis maledictis, flammis acribus addictis" - background song for
a Microsoft sponsored "Where do you want to go from here?" commercial.

* Re: [linux-lvm] Snapshot question... [scaling problem]
From: Larry Dickson @ 2008-04-24 17:19 UTC
  To: LVM general discussion and development

>Then you can't delete an older snapshot until you delete all newer ones.

Not true of what I was proposing - are we talking past each other? If snap 0
is the current (live) COW, and snap -k refers to time(-k) = time(snap 0) -
k*(interval), then reading the virtual data for time(-k) involves looking at
snap -k, then snap -k+1, ... snap 0, current data; but stopping the first
time your block gets a hit. The only point with a race is {snap 0, current
data}. So you can't delete a NEWER snapshot until you delete all OLDER ones
(because the virtual older snaps need the newer COWs). That seems a small
price to pay, since normally you throw them away oldest first.

Larry
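
The deletion constraint described above can be seen in a small self-contained Python example. It assumes the lookup order given in the message (own COW, then each newer COW, then current data); the function name and the sample data are invented for the illustration.

```python
def read_snapshot(cows, current, snap, index):
    """Return block `index` as seen by snapshot `snap` (0 = oldest in `cows`)."""
    for cow in cows[snap:]:
        if index in cow:
            return cow[index]
    return current[index]


# History: the volume starts as ['a', 'b']; the older snapshot is taken;
# block 0 becomes 'A'; the newer snapshot is taken; block 1 becomes 'B'.
cows = [{0: "a"}, {1: "b"}]  # oldest first
current = ["A", "B"]

older_view = [read_snapshot(cows, current, 0, i) for i in range(2)]
newer_view = [read_snapshot(cows, current, 1, i) for i in range(2)]

# Dropping the NEWER COW while the older snapshot still exists corrupts its view.
older_without_newer = [read_snapshot(cows[:1], current, 0, i) for i in range(2)]
# Dropping the OLDEST COW leaves the newer snapshot intact.
newer_without_older = [read_snapshot(cows[1:], current, 0, i) for i in range(2)]

print(older_view)           # ['a', 'b']
print(newer_view)           # ['A', 'b']
print(older_without_newer)  # ['a', 'B']  (block 1 is wrong)
print(newer_without_older)  # ['A', 'b']
```

In this model, discarding snapshots oldest first never disturbs the ones that remain, which matches the usual retention pattern mentioned in the message.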

