* [RFC] Multiple Snapshots - Manageability problem
@ 2007-01-11 18:18 ` Vijai Babu Madhavan
0 siblings, 0 replies; 10+ messages in thread
From: Vijai Babu Madhavan @ 2007-01-11 18:18 UTC (permalink / raw)
To: evms-devel, dm-devel, linux-lvm
Hi,
The problem of DM snapshots with multiple snapshots has been discussed
on the lists quite a bit (most recently at
https://www.redhat.com/archives/dm-devel/2006-October/msg00034.html).
We are currently in the process of building a DM snapshot target that scales
well with many snapshots (so that the changed blocks don't get copied to each
snapshot). In this process, I would also like to validate an assumption.
Today, when a single snapshot gets created, a new cow device of a given size
is also created. IMO, there are two problems with this approach:
a) It is difficult to predict the size of the cow device, which requires predicting
the number of writes that will go to the origin volume during the snapshot
life cycle. It is difficult to get this prediction right: too high a value reduces
utilization, and too low a value increases the chance of the snapshot becoming full.
b) A new cow device needs to be created every time.
This really gets messy and creates a management problem once many
snapshots of a given origin are created, and gets worse with multiple origins.
I think having a single device that holds the cow blocks of any
number of snapshots of a given origin (or of more than one origin) would help
solve this issue. (Apart from this, a single device lets the cow blocks be
shared among snapshots very effectively in a variety of scenarios.)
However, it requires that LVM and EVMS be changed to suit this model, and
it also makes the snapshot target quite complex.
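The difference between one cow device per snapshot and a single shared cow device can be illustrated with a toy model. This is only a sketch: the function names are hypothetical, chunks are plain integers, and it assumes every snapshot was taken before any of the listed writes reach the origin.

```python
def copies_per_snapshot_cow(writes, num_snapshots):
    """One cow device per snapshot: the first write to an origin chunk
    copies the old data once into every snapshot's cow device."""
    copied = [set() for _ in range(num_snapshots)]  # chunks already copied, per snapshot
    copies = 0
    for chunk in writes:
        for seen in copied:
            if chunk not in seen:
                seen.add(chunk)
                copies += 1
    return copies


def copies_shared_cow(writes):
    """Single shared cow device: the old data is copied once and the copy
    is shared by all snapshots that need it."""
    seen = set()
    copies = 0
    for chunk in writes:
        if chunk not in seen:
            seen.add(chunk)
            copies += 1
    return copies


writes = [0, 1, 1, 2, 0]  # origin chunk numbers, in write order
print(copies_per_snapshot_cow(writes, 4))  # 3 distinct chunks x 4 snapshots
print(copies_shared_cow(writes))           # 3 distinct chunks, copied once each
```

In this model the copy work grows with the number of snapshots in the first case and stays constant in the second, which is the scaling property the proposal is after.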
I would like to receive some comments about what users, developers
and others think about this.
Thanks,
Vijai
P.S. Apologies for cross-posting.
_______________________________________________
linux-lvm mailing list
linux-lvm@redhat.com
https://www.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/
* RE: [RFC] Multiple Snapshots - Manageability problem
2007-01-11 18:18 ` [linux-lvm] " Vijai Babu Madhavan
@ 2007-01-11 21:34 ` Wilson, Christopher J
-1 siblings, 0 replies; 10+ messages in thread
From: Wilson, Christopher J @ 2007-01-11 21:34 UTC (permalink / raw)
To: device-mapper development, evms-devel, linux-lvm
I haven't read through all of these options yet (but I will). I will
say that synthesizing all your cow objects into one pool will be
difficult. You're going to have issues with garbage collection of old
copies and may have to build in some scavenge or compress functions
which will take system resources. From my experience with disk-based
de-duplication technologies, you're heading down a hole which can be a
dark place. There are performance issues, and maintaining all those
pointers is problematic. The virtual pool sounds good, and works very
well for primary storage functions (3PAR), but in practice for backup
applications with virtual pools for deduplication it's not been so hot.
I'm not clear what the issue is with maintaining multiple cow snapshots.
Just exactly how many are users asking for? Keeping more than a few cow
snaps online is not using the function for what it was meant for. COW
technology is for immediate rollback (to me) and not for long term
backup images. Sizing is an issue that will not go away and is not
resolvable in any low-level OS code; this is a business/user issue.
Most customers don't even know how much data they're going to have, much
less what their average write rates are, and I don't envision a cow pool
as solving the sizing issue.
If I had my way I'd rather see energy put into cow technology for use as
a disk cache for backup applications and tighter integration with those
apps. Better still would be for interfaces from business-level
applications (Oracle, MySQL, etc.) to quiesce I/O, flush buffers, and take
a consistent copy of the application, state and all. Putting together
an application-level copy on hardware, and being able to move that through a
tighter workflow to backup media through a common API, would be my
preference instead of having each user create their own individual
"glue" code. If you look into SNIA's SMI-S (Storage Management Initiative
Specification) Copy Services package, there may already be a template for
this. I'd say at least that supporting SMI-S Copy Services through that API is
desirable because a lot of the SRM applications today are on their way to
leveraging that code.
Christopher Wilson
Storage Architect
Verizon Business
IT Solutions - IP Application Hosting
240 264 4136
vnet: 364 4136
* RE: [RFC] Multiple Snapshots - Manageability problem
2007-01-11 21:34 ` [linux-lvm] RE: [dm-devel] " Wilson, Christopher J
@ 2007-01-12 4:46 ` Vijai Babu Madhavan
-1 siblings, 0 replies; 10+ messages in thread
From: Vijai Babu Madhavan @ 2007-01-12 4:46 UTC (permalink / raw)
To: evms-devel, device-mapper development, linux-lvm
Hi Chris,
Thanks for the response. I am trying to keep my mails short, as I
believe the lack of responses to my mails is probably due to the
fact that they are long, but it's kinda difficult to keep them small
and still convey the various aspects. :)
>>> On 1/12/2007 at 3:04 AM, "Wilson, Christopher J"
<chris.j.wilson@verizonbusiness.com> wrote:
> I haven't read through all of these options yet (but I will). I will
> say that synthesizing all your cow objects into one pool will be
> difficult. You're going to have issues with garbage collection of old
> copies and may have to build in some scavenge or compress functions
> which will take system resources. From my experience with disk based
> de-duplication technologies you're heading down a hole which can be a
> dark place. There are performance issues and maintaining all those
> pointers is problematic. The virtual pool sounds good, and works very
> well for primary storage functions (3PAR) but in practice for backup
> applications with virtual pools for deduplication it's not been so hot.
I completely agree that it's not going to be easy. But I guess some price
needs to be paid to get the benefits. If snapshots could be implemented
at the file system level, we would not necessarily need to redo a lot of this,
but building snapshot functionality into the file system itself comes with
the obvious drawback. It would be good if we could build some framework at the
file system layer that is not tied to any one file system.
I have not had a chance to spend time in this space yet; do others
have any ideas in this space?
> I'm not clear what the issue is with maintaining multiple cow snapshots.
> Just exactly how many are users asking for? Keeping more than a few cow
> snaps online is not using the function for what it was meant for. COW
> technology is for immediate rollback (to me) and not for long term
> backup images.
From what we see from users and IT admins, there are two common uses of
snapshots:
a) Snapshots for backups
b) Snapshots as backups
In the first case, snapshots are taken to avoid open-file errors, etc.,
and keeping a few snapshots online is more than sufficient.
But, increasingly, we see a lot of admins trying to deploy D2D2T
(Disk->Disk->Tape) to avoid the many problems associated with tape
backups. Snapshots are a very efficient way of keeping
disk backups to protect against logical failures (though of course not
against hardware failures).
Hence, the second case is becoming a strong use case, as admins want to
take 3-4 snapshots a day and recycle them after a week or two.
Depending on the frequency and on how long each snapshot is kept alive, the
number of snapshots easily gets into double digits and, in some cases, triple digits.
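As a quick check of the arithmetic behind these figures, the number of snapshots online at steady state is roughly the snapshot frequency times the retention period. The helper name below is hypothetical, and the sketch assumes snapshots are taken at a fixed rate and recycled exactly when the retention period ends.

```python
def snapshots_online(per_day, retention_days):
    """Approximate number of snapshots held at once under a fixed schedule."""
    return per_day * retention_days


low = snapshots_online(3, 7)    # 3 a day, recycled after one week
high = snapshots_online(4, 14)  # 4 a day, recycled after two weeks
print(low, high)                # 21 56
```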
With the current DM snapshot code, even a couple of snapshots make
performance drop rapidly (the throughput numbers in the earlier mail thread and
the complaints from users reported on the list indicate this).
As we fix this performance issue with multiple snapshots, it also makes sense
to fix the management issue by using a single cow device. Besides, a single
cow device provides a very efficient way to share blocks among
snapshots. This also enables the snapshots to be managed independently.
> Sizing is an issue that will not go away and is not
> resolvable in any low level OS code, this is a business/user issue.
> Most customers don't even know how much data they're going to have much
> less what their average write rates are, and I don't envision a cow pool
> as solving the sizing issue.
I totally agree. I guess most admins today load their servers to around 60-70%
utilization to avoid these space issues. While this works OK for primary servers,
it is impractical to waste that much space in each snapshot, especially with multiple
snapshots. I think having a single cow device for each origin, or preferably multiple
origins sharing a single cow device, would help alleviate this.
> If I had my way I'd rather see energy put into cow technology for use as
> a disk cache for backup applications and tighter integration with those
> apps. Better still would be for interfaces from business level
> applications (Oracle, MySQL, etc) to quiesce IO, flush buffers, and take
> a consistent copy of the application, state and all. Putting together
> an application level copy on hardware, being able to move that through a
> tighter workflow to backup media through a common API would be my
> preference instead of having each user create their own individual
> "glue" code. If you look into SNIA's SMI-S (Storage Management API)
> copy services package there may already be a template for this. I'd say
> at least that supporting SMI-S Copy Services through that API is
> desirable because a lot of the SRM application today are on their way to
> leveraging that code.
I completely agree. An application-coordinated snapshot facility is really
important and would really help a lot of application developers and admins.
It is going to be interesting and challenging to build a framework that would
satisfy diverse application needs. At Novell, we also have some interest in this
space; we are going through some internal processes, and I believe we will
come out with something some time soon.
Vijai
* Re: [RFC] Multiple Snapshots - Manageability problem
2007-01-11 18:18 ` [linux-lvm] " Vijai Babu Madhavan
(?)
(?)
@ 2007-01-12 21:59 ` Benjamin Marzinski
2007-01-30 4:39 ` Mike Snitzer
-1 siblings, 1 reply; 10+ messages in thread
From: Benjamin Marzinski @ 2007-01-12 21:59 UTC (permalink / raw)
To: device-mapper development
On Thu, Jan 11, 2007 at 11:18:13AM -0700, Vijai Babu Madhavan wrote:
> Hi,
>
> The problem of DM snapshots with multiple snapshots has been discussed
> on the lists quite a bit (most recently at
> https://www.redhat.com/archives/dm-devel/2006-October/msg00034.html).
>
> We are currently in the process of building a DM snapshot target that scales
> well with many snapshots (so that the changed blocks don't get copied to each
> snapshot). In this process, I would also like to validate an assumption.
>
> Today, when a single snapshot gets created, a new cow device of a given size
> is also created. IMO, there are two problems with this approach:
>
> a) It is difficult to predict the size of the cow device, which requires a prediction
> of the number of writes would go into the origin volume during the snapshot
> life cycle. It is difficult to get this prediction right, as very high value reduces
> utilization and low value increases the chances of snapshot becoming full.
>
> b) A new cow device needs to be created every time.
>
> This really gets messy and creates a management problem once many
> snapshots of a given origin are created, and gets worse with multiple origins.
>
> I am thinking, having a single device that would hold the cow blocks of any
> number of snapshots of a given origin (or more) would help solve this issue
> (Apart from this, having a single device helps share the cow blocks among
> snapshots very effectively in a variety of scenarios).
>
> But, it does require that LVM and EVMS be changed to suit this model and
> also makes the snapshot target quite complex.
>
> I would like to receive some comments about what users, developers
> and others think about this.
>
Have you taken a look at Daniel Phillips' cluster snapshot work?
http://sources.redhat.com/cluster/csnap/index.html
The code is not complete, and I am not sure if Daniel is doing any work on it at
all, but it has a nice design for storing the cow data, and that URL contains the
design documents. In brief:
There is one device that stores all cow data (the snapstore). It has three
main parts: an allocation bitmap, a superblock that stores metadata, and
an exception btree. The exception btree is indexed by the location of the data
on the origin. For each chunk on the origin device that has cow data for one or
more snapshots, there is an exception in the btree that lists the location of
the cow data on the snapstore device and the snapshots that are using that
exception. This list of snapshots is stored as a bitmask.
This means that no matter how many snapshots you have, all you need to do to
write to the origin is check the btree.
1. If every snapshot has an exception at that location, you're free to write.
And you can put that location in a cache, so you never need to check the btree
again until a new snapshot is created.
2. If there are snapshots that don't have an exception in the btree, you
allocate space on the disk, copy the data from the origin, and add an exception
to the btree, with a bitmask containing every snapshot that doesn't already
have an exception. You can then cache this location, so you don't have to
check the btree again until a new snapshot is created.
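The two cases above can be sketched as a small in-memory model. This is not Daniel's code or on-disk format: the class and method names are hypothetical, a Python dict stands in for the exception btree, a counter stands in for the allocation bitmap, and the actual data copy is omitted.

```python
class SnapStore:
    """Toy model of a shared snapshot store (all names are illustrative)."""

    def __init__(self):
        self.active_mask = 0   # bit i set => snapshot i exists
        self.exceptions = {}   # origin chunk -> [(snapstore chunk, snapshot bitmask)]
        self.next_free = 0     # stand-in for the allocation bitmap
        self.writable = set()  # cache of origin chunks known to be fully covered

    def create_snapshot(self, snap_id):
        self.active_mask |= 1 << snap_id
        self.writable.clear()  # the cache is only valid until a new snapshot appears

    def covered_mask(self, chunk):
        """Bitmask of snapshots that already have an exception for this chunk."""
        mask = 0
        for _, snap_mask in self.exceptions.get(chunk, []):
            mask |= snap_mask
        return mask

    def before_origin_write(self, chunk):
        """Call before writing an origin chunk; returns the number of copies made."""
        if chunk in self.writable:
            return 0                      # cached: no lookup needed
        missing = self.active_mask & ~self.covered_mask(chunk)
        copies = 0
        if missing:                       # case 2: some snapshots lack an exception
            location = self.next_free     # allocate space on the snapstore
            self.next_free += 1
            # (copying the old origin data to `location` would happen here)
            self.exceptions.setdefault(chunk, []).append((location, missing))
            copies = 1
        self.writable.add(chunk)          # case 1 now holds; cache the location
        return copies


store = SnapStore()
store.create_snapshot(0)
store.create_snapshot(1)
print(store.before_origin_write(7))  # 1: one copy shared by snapshots 0 and 1
print(store.before_origin_write(7))  # 0: served from the cache
store.create_snapshot(2)
print(store.before_origin_write(7))  # 1: only snapshot 2 was missing an exception
```

Each origin write costs at most one copy regardless of how many snapshots exist, because one exception carries a bitmask naming every snapshot that shares it.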
This saves both space and time over the existing implementation. Daniel's
code has a lot of stuff that is related to making the device clustered, which
you can ignore for the single machine case. But it is very nice to have a design
that is easily clusterable, so that switching between a single machine and
clustered snapshot can be done by simply flipping some bits instead of having to
convert between different ondisk formats.
-Ben
> Thanks,
> Vijai
> P.S. Apologies for cross-posting.
>
>
>
> --
> dm-devel mailing list
> dm-devel@redhat.com
> https://www.redhat.com/mailman/listinfo/dm-devel
* Re: [RFC] Multiple Snapshots - Manageability problem
2007-01-12 21:59 ` Benjamin Marzinski
@ 2007-01-30 4:39 ` Mike Snitzer
2007-03-04 15:55 ` Dan Kegel
0 siblings, 1 reply; 10+ messages in thread
From: Mike Snitzer @ 2007-01-30 4:39 UTC (permalink / raw)
To: device-mapper development
On 1/12/07, Benjamin Marzinski <bmarzins@redhat.com> wrote:
> On Thu, Jan 11, 2007 at 11:18:13AM -0700, Vijai Babu Madhavan wrote:
> > I would like to receive some comments about what users, developers
> > and others think about this.
> >
>
> Have you taken a look at Daniel Phillips cluster snapshot work?
>
> http://sources.redhat.com/cluster/csnap/index.html
>
> The code is not complete, and am not sure if Daniel is doing any work on it at
> all, but it has a nice design to store the cow data, and that URL contains the
> design documents. In brief:
>
> There is one device that stores all cow data (the snapstore). It has three
> main parts, an allocation bitmap, a superblock that stores metadata, and
> an exception btree. The exception btree is indexed by the location of the data
> on the origin. For each chunk on the origin device that has cow data for one or
> more snapshots, there is an exception in the btree that lists the location of
> the cow data on the snapstore device, and the snapshots which are using that
> exception. This list of snapshots is stored as a bitmask.
>
> This means that no matter how many snapshots you have, all you need to do to
> write to the origin is check the btree.
Daniel's work has become an integral part of the hotcakes project:
http://code.google.com/p/hotcakes/
There was/is talk of making the dm targets used for ddsnap work
locally as a replacement for dm-snapshot. It might be wise to catch
up with the hotcakes people to see if you could leverage it as the
basis for a dm-snapshot++
Mike
* Re: [RFC] Multiple Snapshots - Manageability problem
@ 2008-01-23 16:16 Tomasz Chmielewski
0 siblings, 0 replies; 10+ messages in thread
From: Tomasz Chmielewski @ 2008-01-23 16:16 UTC (permalink / raw)
To: linux-lvm, mvijai, dm-devel
Vijai Babu Madhavan, Thu, 11 Jan 2007 11:18:13 -0700, wrote:
> The problem of DM snapshots with multiple snapshots has been discussed
> on the lists quite a bit (most recently at
> https://www.redhat.com/archives/dm-devel/2006-October/msg00034.html).
>
> We are currently in the process of building a DM snapshot target that scales
> well with many snapshots (so that the changed blocks don't get copied to each
> snapshot). In this process, I would also like to validate an assumption.
>
> Today, when a single snapshot gets created, a new cow device of a given size
> is also created. IMO, there are two problems with this approach:
>
> a) It is difficult to predict the size of the cow device, which requires a prediction
> of the number of writes would go into the origin volume during the snapshot
> life cycle. It is difficult to get this prediction right, as very high value reduces
> utilization and low value increases the chances of snapshot becoming full.
>
> b) A new cow device needs to be created every time.
Hi,
Any news on that?
With multiple snapshots, write performance still degrades linearly. Is
any work being done to change that anytime soon?
--
Tomasz Chmielewski
http://wpkg.org
end of thread, other threads:[~2008-01-23 16:16 UTC | newest]
2007-01-11 18:18 [RFC] Multiple Snapshots - Manageability problem Vijai Babu Madhavan
2007-01-11 18:18 ` [linux-lvm] " Vijai Babu Madhavan
2007-01-11 21:34 ` Wilson, Christopher J
2007-01-11 21:34 ` [linux-lvm] RE: [dm-devel] " Wilson, Christopher J
2007-01-12 4:46 ` Vijai Babu Madhavan
2007-01-12 4:46 ` [linux-lvm] RE: [dm-devel] " Vijai Babu Madhavan
2007-01-12 21:59 ` Benjamin Marzinski
2007-01-30 4:39 ` Mike Snitzer
2007-03-04 15:55 ` Dan Kegel
-- strict thread matches above, loose matches on Subject: below --
2008-01-23 16:16 Tomasz Chmielewski