* [uml-devel] Dynamic remount with variable COW stacking/merging needed, support for snapshot repilication
@ 2004-01-17 21:13 Stephen D. Williams
2004-01-18 4:45 ` [uml-devel] Re: [uml-user] " Jeff Dike
2004-01-18 11:04 ` [uml-devel] " BlaisorBlade
0 siblings, 2 replies; 8+ messages in thread
From: Stephen D. Williams @ 2004-01-17 21:13 UTC (permalink / raw)
To: user-mode-linux-devel, user-mode-linux-user
[-- Attachment #1: Type: text/plain, Size: 1423 bytes --]
When a UML instance is running an always-on service, such as a web
server, an administrator needs to be able to make live backups, or
replications, of a running system.
This can be done using LVM snapshots, although that is not always
appropriate for just this feature. A UML instance can be paused and then
restarted, but this can cause severe delays while large partitions are
replicated.
The COW ability is a great basis for an ideal solution, but I believe we
need to identify and implement some additional features.
What I propose as a useful solution is:
A UML instance mounts filesystems directly or based on a COW image.
When an administrator invokes a console snapshot mode, the UML instance
freezes briefly, creates new delta COWs and stacks them on the existing
mounts, then resumes.
When the administrator completes whatever snapshot backup is needed,
they invoke a console unsnapshot command which pauses the instance,
merges the delta COWs, remounts the original images with updates, and
resumes.
This relies on COW stacking, which I saw was added in a patch last year
and I assume is still present.
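A minimal in-memory sketch of the push/merge idea proposed above, for illustration only. It assumes a tiny fixed-size image; the names cow_stack, cow_push and cow_pop_merge are hypothetical and are not taken from the UML sources or from the stacking patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NBLOCKS   8   /* blocks per image (tiny, for illustration) */
#define BLKSZ     4   /* bytes per block */
#define MAXLAYERS 4   /* maximum number of stacked deltas */

/* One delta layer: which blocks it holds, plus their data. */
struct cow_layer {
    bool present[NBLOCKS];
    char data[NBLOCKS][BLKSZ];
};

/* A base image with a stack of delta layers; writes go to the top layer. */
struct cow_stack {
    char base[NBLOCKS][BLKSZ];
    struct cow_layer layers[MAXLAYERS];
    int nlayers;
};

/* "snapshot": push an empty delta so everything below stops changing. */
int cow_push(struct cow_stack *s)
{
    if (s->nlayers == MAXLAYERS)
        return -1;
    memset(&s->layers[s->nlayers], 0, sizeof(s->layers[0]));
    s->nlayers++;
    return 0;
}

void cow_write(struct cow_stack *s, int blk, const char *buf)
{
    assert(blk >= 0 && blk < NBLOCKS);
    if (s->nlayers == 0) {
        memcpy(s->base[blk], buf, BLKSZ);
        return;
    }
    struct cow_layer *top = &s->layers[s->nlayers - 1];
    memcpy(top->data[blk], buf, BLKSZ);
    top->present[blk] = true;
}

/* Reads fall through the layers from top to bottom, then to the base. */
void cow_read(const struct cow_stack *s, int blk, char *buf)
{
    assert(blk >= 0 && blk < NBLOCKS);
    for (int i = s->nlayers - 1; i >= 0; i--) {
        if (s->layers[i].present[blk]) {
            memcpy(buf, s->layers[i].data[blk], BLKSZ);
            return;
        }
    }
    memcpy(buf, s->base[blk], BLKSZ);
}

/* "unsnapshot": fold the top delta into the layer (or base) beneath it. */
int cow_pop_merge(struct cow_stack *s)
{
    if (s->nlayers == 0)
        return -1;
    struct cow_layer *top = &s->layers[s->nlayers - 1];
    for (int blk = 0; blk < NBLOCKS; blk++) {
        if (!top->present[blk])
            continue;
        if (s->nlayers == 1) {
            memcpy(s->base[blk], top->data[blk], BLKSZ);
        } else {
            struct cow_layer *below = &s->layers[s->nlayers - 2];
            memcpy(below->data[blk], top->data[blk], BLKSZ);
            below->present[blk] = true;
        }
    }
    s->nlayers--;
    return 0;
}
```

While a delta is pushed, the base (or lower layers) can be copied at leisure because no write touches them; the cost of cow_pop_merge grows with the number of blocks written during the snapshot.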
The downtime for the instance would generally be measured in seconds.
Can this be done now? What needs to be added to support it?
sdw
--
swilliams@hpti.com http://www.hpti.com Personal: sdw@lig.net http://sdw.st
Stephen D. Williams 703-724-0118W 703-995-0407Fax 20147-4622 AIM: sdw
[-- Attachment #2: sdw.vcf --]
[-- Type: text/x-vcard, Size: 234 bytes --]
begin:vcard
fn:Stephen Williams
n:Williams;Stephen
email;internet:sdw@lig.net
tel;work:703-724-0118
tel;fax:703-995-0407
tel;pager:sdwpage@lig.net
tel;home:703-729-5405
tel;cell:703-371-9362
x-mozilla-html:TRUE
version:2.1
end:vcard
^ permalink raw reply [flat|nested] 8+ messages in thread
* [uml-devel] Re: [uml-user] Dynamic remount with variable COW stacking/merging needed, support for snapshot repilication
2004-01-17 21:13 [uml-devel] Dynamic remount with variable COW stacking/merging needed, support for snapshot repilication Stephen D. Williams
@ 2004-01-18 4:45 ` Jeff Dike
2004-01-18 7:07 ` Stephen D. Williams
2004-01-18 11:04 ` [uml-devel] " BlaisorBlade
1 sibling, 1 reply; 8+ messages in thread
From: Jeff Dike @ 2004-01-18 4:45 UTC (permalink / raw)
To: Stephen D. Williams; +Cc: user-mode-linux-devel, user-mode-linux-user
On Sat, Jan 17, 2004 at 04:13:39PM -0500, Stephen D. Williams wrote:
> When a UML instance is running an always-on service, such as a web
> server, an administrator needs to be able to make live backups, or
> replications, of a running system.
>
> Can this be done now? What needs to be added to support it?
Have you seen the stop, sysrq s, cp, go trick described at
http://user-mode-linux.sourceforge.net/mconsole.html ?
Jeff
-------------------------------------------------------
The SF.Net email is sponsored by EclipseCon 2004
Premiere Conference on Open Tools Development and Integration
See the breadth of Eclipse activity. February 3-5 in Anaheim, CA.
http://www.eclipsecon.org/osdn
_______________________________________________
User-mode-linux-devel mailing list
User-mode-linux-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/user-mode-linux-devel
^ permalink raw reply [flat|nested] 8+ messages in thread
* [uml-devel] Re: [uml-user] Dynamic remount with variable COW stacking/merging needed, support for snapshot repilication
2004-01-18 4:45 ` [uml-devel] Re: [uml-user] " Jeff Dike
@ 2004-01-18 7:07 ` Stephen D. Williams
2004-01-18 16:23 ` s-uml
0 siblings, 1 reply; 8+ messages in thread
From: Stephen D. Williams @ 2004-01-18 7:07 UTC (permalink / raw)
To: Jeff Dike; +Cc: user-mode-linux-devel, user-mode-linux-user
[-- Attachment #1: Type: text/plain, Size: 2003 bytes --]
Yes, but that doesn't meet the key requirement I am proposing: that the
downtime be limited to a few seconds. The only potentially significant
downtime in what I am proposing is merging the COWs if there was a lot of
disk I/O during 'snapshot mode'. The existing ability is certainly
useful, but it is not sufficient to get backups with the least impact on a
running system. Snapshot mode is suspend and flush/sync with the
additional semantics of an automatic temporary push/pop of a COW layer
on all filesystem images. The remaining issue of merging COWs could be
handled by gradual merging, much like RAID recovery.
The problem is that if you have gigabytes of filesystem images, it takes
time to copy them, even when using something like rsync to determine what
has changed. Scanning the zero-filled holes in sparse files also costs
time, unless there is a program that can determine which blocks are
actually in use without reading them.
I should have also mentioned that it should be possible to get a
reliable feed of which blocks (or, more likely, ranges) have changed since
a certain event. This would allow very efficient, near-realtime
replication. At the very least this could be used in suspend or
'snapshot mode' for efficiency, but with proper pushing of blocks, buffer
visibility, or some kind of write-through notification, synchronization
could be realtime.
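One way to picture such a feed is a dirty-block bitmap that is reset at the "certain event" and later read back as ranges. The sketch below is only an illustration under that assumption; dirty_map, dirty_mark and dirty_ranges are hypothetical names, not an existing UML or ubd interface.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NBLOCKS 64   /* blocks tracked (small, for illustration) */

/* One bit per block: set when the block is written after the last reset. */
struct dirty_map {
    uint8_t bits[NBLOCKS / 8];
};

/* A half-open range [start, end) of changed blocks. */
struct range {
    unsigned start, end;
};

/* Call at the reference event (e.g. when a snapshot is taken). */
void dirty_reset(struct dirty_map *m)
{
    memset(m->bits, 0, sizeof(m->bits));
}

/* Call from the write path for every block written. */
void dirty_mark(struct dirty_map *m, unsigned blk)
{
    assert(blk < NBLOCKS);
    m->bits[blk / 8] |= (uint8_t)(1u << (blk % 8));
}

int dirty_test(const struct dirty_map *m, unsigned blk)
{
    assert(blk < NBLOCKS);
    return (m->bits[blk / 8] >> (blk % 8)) & 1;
}

/* Coalesce adjacent dirty blocks into ranges; returns the range count. */
int dirty_ranges(const struct dirty_map *m, struct range *out, int max)
{
    int n = 0;
    unsigned blk = 0;

    while (blk < NBLOCKS && n < max) {
        if (!dirty_test(m, blk)) {
            blk++;
            continue;
        }
        unsigned start = blk;
        while (blk < NBLOCKS && dirty_test(m, blk))
            blk++;
        out[n].start = start;
        out[n].end = blk;
        n++;
    }
    return n;
}
```

A replication tool would then copy only the reported ranges instead of scanning the whole image.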
sdw
Jeff Dike wrote:
>On Sat, Jan 17, 2004 at 04:13:39PM -0500, Stephen D. Williams wrote:
>
>
>>When a UML instance is running an always-on service, such as a web
>>server, an administrator needs to be able to make live backups, or
>>replications, of a running system.
>>
>>Can this be done now? What needs to be added to support it?
>>
>>
>
>Have you seen the stop, sysrq s, cp, go trick described at
>http://user-mode-linux.sourceforge.net/mconsole.html ?
>
> Jeff
>
>
--
swilliams@hpti.com http://www.hpti.com Personal: sdw@lig.net http://sdw.st
Stephen D. Williams 703-724-0118W 703-995-0407Fax 20147-4622 AIM: sdw
^ permalink raw reply [flat|nested] 8+ messages in thread
* [uml-devel] Re: [uml-user] Dynamic remount with variable COW stacking/merging needed, support for snapshot repilication
2004-01-18 7:07 ` Stephen D. Williams
@ 2004-01-18 16:23 ` s-uml
0 siblings, 0 replies; 8+ messages in thread
From: s-uml @ 2004-01-18 16:23 UTC (permalink / raw)
To: Stephen D. Williams
Cc: Jeff Dike, user-mode-linux-devel, user-mode-linux-user
On Sun, Jan 18, 2004 at 02:07:56AM -0500, Stephen D. Williams wrote:
> Yes, but that doesn't meet the key requirement I am proposing: that the
> downtime be limited to a few seconds.
If you use LVM on the host, you can take a snapshot of the volume the ubd
images are on while the UML is stopped. The snapshot is made almost
instantly, so the UML would be stopped for very little time.
-Steve
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [uml-devel] Dynamic remount with variable COW stacking/merging needed, support for snapshot repilication
2004-01-17 21:13 [uml-devel] Dynamic remount with variable COW stacking/merging needed, support for snapshot repilication Stephen D. Williams
2004-01-18 4:45 ` [uml-devel] Re: [uml-user] " Jeff Dike
@ 2004-01-18 11:04 ` BlaisorBlade
1 sibling, 0 replies; 8+ messages in thread
From: BlaisorBlade @ 2004-01-18 11:04 UTC (permalink / raw)
To: user-mode-linux-devel
At 22:13 on Saturday 17 January 2004, Stephen D. Williams wrote:
> When a UML instance is running an always-on service, such as a web
> server, an administrator needs to be able to make live backups, or
> replications, of a running system.
>
> This can be done using LVM snapshots, although that is not always
> appropriate for just this feature. A UML instance can be paused and then
> restarted, but this can cause severe delays while large partitions are
> replicated.
In general, this should be done with LVM. There is a strong feeling that COW
somehow duplicates LVM, and I read that Jeff Dike (IIRC) wanted it to
be rewritten as a new snapshot format for LVM (the reason for a different
format is that LVM snapshots were designed with partitions in mind, while UML
can use sparse files, which is a great advantage if you are able to use it).
It may be harder to set up, that's true, but have a look at EVMS, which
provides a nicer interface.
Bye
--
cat <<EOSIGN
Paolo Giarrusso, aka Blaisorblade
Linux Kernel 2.4.23/2.6.0 on an i686; Linux registered user n. 292729
EOSIGN
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [uml-devel] Dynamic remount with variable COW stacking/merging needed, support for snapshot repilication
@ 2004-01-18 8:20 James W McMechan
2004-01-18 16:13 ` BlaisorBlade
0 siblings, 1 reply; 8+ messages in thread
From: James W McMechan @ 2004-01-18 8:20 UTC (permalink / raw)
To: sdw, user-mode-linux-user, user-mode-linux-devel
The stackable COW was not merged. I have a new, more abstracted version
that I am working on, but Jeff seemed to want to replace the LVM driver
layer above the ubd device to implement the COW features, rather than
having COW as a system feature below the ubd layer, where I need it so
that it works like a hardware function. The LVM system can provide its
own snapshot feature above the ubd driver.
That makes LVM something the user of the UML sets up for snapshots, while
COW is set up by the host admin completely outside the UML and is
(mostly) invisible to the admin of the UML (guest root). This provides
two separate but useful capabilities, one to the host admin and one to
the guest admin.
Mostly I have been tinkering with the ISAM-style COW so that filesystems
that do not support sparse files will work better, but there are a lot of
corner cases to look at, so progress is a bit slow.
The dynamic remount could be done in much less than one second if you
prepare as a separate step. Then it is just a matter of replacing the
file handle in the device structure with a new handle: all writes before
the switch go to the original COW, and all writes afterwards go to the
new COW. Since that is just a single pointer write, it does not have too
many problems; only closing while the snapshot is running is likely to
break things. Once the new COW is in place, the lower-layer COWs can be
merged, since they are then read-only. If an image is mounted r/w on more
than one device things would get messy, but that is bad anyway, so I
don't consider it a major bug but rather a user problem.
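The handle swap described above can be sketched in user space with a C11 atomic exchange. This is an illustration only: struct cow_dev, cow_dev_switch and cow_dev_pwrite are hypothetical names, and the real ubd device structure and its locking are not shown in this thread.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdatomic.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical device structure: only the field relevant here. */
struct cow_dev {
    _Atomic int cow_fd;   /* descriptor that currently receives writes */
};

/*
 * Switch to a COW file that was opened and prepared beforehand.
 * Returns the old descriptor, which is no longer written to and can
 * therefore be merged and closed by the caller.
 */
int cow_dev_switch(struct cow_dev *dev, int new_fd)
{
    assert(new_fd >= 0);
    return atomic_exchange(&dev->cow_fd, new_fd);
}

/* Write path: each request loads the descriptor exactly once. */
ssize_t cow_dev_pwrite(struct cow_dev *dev, const void *buf,
                       size_t len, off_t off)
{
    int fd = atomic_load(&dev->cow_fd);
    return pwrite(fd, buf, len, off);
}
```

Because the slow work (creating the new COW file and writing its header) happens before the exchange, the switch itself is a single store; requests already in flight finish against the old descriptor, which is why the old file must not be closed until they have drained.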
I have been thinking of some more ubd flags, such as ubd=C10H128S63 to
change the disk layout, or ubd1C10 for the ubd device 1, cylinder 10
case, as an individual example:
  C   cylinders
  H   heads
  S   sectors
  P   padding, i.e. 512-byte header padding, so that raw devices can be
      checked without failing due to I/O errors dropping out
  M   mmap size (index, data)
  V   which header version to create:
        V0  a raw data file; don't check for COW
        V1  the first header version, not portable
        V2  the second one, with the wrong math
        V3  my version, with separated offset and length; the separation
            allows a program to fix the math errors on detection without
            having to redo the header format
        V4  my version of ISAM
  L   symlinks as names
  U   update in place, which only really applies to the moo program, but
      I was thinking all the options should be defined
  ??  sector size should have an option
  ??  page size, i.e. use 64k even on i386 so that a COW can be read on,
      say, an alpha
  A   set AIO mode
  R   read-only (so all options are uppercase)
  S   sync data at each read/write
  ??  sync on barriers for 2.6
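To make the proposed syntax concrete, here is a small parser sketch for the geometry part only (C, H and S followed by a decimal number). The flag syntax is a proposal in this message, not an existing ubd option, and the sketch assumes the number after C is a cylinder count; struct ubd_geom and ubd_parse_geom are hypothetical names.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Disk geometry collected from a flag string such as "C10H128S63". */
struct ubd_geom {
    unsigned long cylinders, heads, sectors;
};

/*
 * Parse a sequence of <letter><decimal number> pairs.
 * Returns 0 on success, -1 on an unknown letter or a missing number.
 * Fields whose letter does not appear are left untouched.
 */
int ubd_parse_geom(const char *s, struct ubd_geom *g)
{
    assert(s != NULL && g != NULL);

    while (*s != '\0') {
        char flag = *s++;
        char *end;
        unsigned long val;

        if (!isdigit((unsigned char)*s))
            return -1;          /* every flag needs a number */
        val = strtoul(s, &end, 10);

        switch (flag) {
        case 'C': g->cylinders = val; break;
        case 'H': g->heads = val;     break;
        case 'S': g->sectors = val;   break;
        default:  return -1;    /* other letters are not handled here */
        }
        s = end;
    }
    return 0;
}
```

The list above uses S for both sectors and sync; a full parser would have to give one of them a different letter, which this sketch sidesteps by treating S as sectors only.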
The types of mmapping I think might be needed:
  no mmap      -- for when mmap does not work; some filesystems do not
                  mmap well
  index mmap   -- as now, only the index/bitmap is mmapped; uses a fair
                  amount of address space
  full mmap    -- map in both the index/bitmap and the data; uses a huge
                  amount of address space
  paged index  -- map in a (few) pages of the index at a time; I have
                  run with one page nicely
  paged data   -- map in a (few) pages of the data at a time; I am not
                  sure that this is helpful
The problem I see with mmapping is that, in order to mmap properly, I
need to kmalloc a buffer of the right size first and then mmap on top of
the buffer, so that the kernel does not try to use the mmapped space for
other purposes outside of the ubd_user/cow_user functions. It would be
free to do so if the space is not kmalloc'ed, and then if it reads or
writes that address space the _user function would be using the mmapped
area rather than kernel memory, or the kernel would overwrite the
index/bitmap with kernel data. Ick, the mind boggles.
The types of headers I think should be present:
  COW     -- what we have now
  ISAM    -- which will work without sparse files
  DISK    -- just to keep disk image info (CHS etc.)
  HEADER  -- like DISK, but treats the data in the disk as a separate
             file, sort of like the backing file in a COW setup; several
             different HEADERs could be set up for one image file, each
             with a different C/H/S layout, for example
The problem I see with ubd=mmap is that it does not have a failure path
for a non-page-aligned data sector; it looks like it just drops it. I
think this would occur mostly on metadata updates. I was looking at
2.4.23-1um, and I could be wrong, but if it does drop them I would expect
massive data corruption.
My infrastructure currently uses mmap, but I hope to make that optional
since it gobbles address space. ISAM uses 8 bytes per sector for index
entries and so gobbles 64 times as much as regular COW files, and I
already have problems with the regular COW. If I get the paged version
working it will be much better.
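The factor of 64 follows if the regular COW index is a bitmap with one bit per sector, since an 8-byte entry is 64 bits. The short sketch below works that through; the 512-byte sector size and the function names are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SIZE 512u   /* assumed sector size */

/* Bitmap COW: one bit per sector, rounded up to whole bytes. */
uint64_t bitmap_index_bytes(uint64_t image_bytes)
{
    assert(image_bytes % SECTOR_SIZE == 0);
    uint64_t sectors = image_bytes / SECTOR_SIZE;
    return (sectors + 7) / 8;
}

/* ISAM-style COW: one 8-byte index entry per sector. */
uint64_t isam_index_bytes(uint64_t image_bytes)
{
    assert(image_bytes % SECTOR_SIZE == 0);
    uint64_t sectors = image_bytes / SECTOR_SIZE;
    return sectors * 8;
}
```

For a 1 GiB image this gives 2,097,152 sectors, so a 256 KiB bitmap against a 16 MiB ISAM index, all of which has to be mapped when the whole index is mmapped.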
The only other calls it uses are ubd_malloc/ubd_free (kmalloc/kfree) and
open, close, pread and pwrite, all of which are being abstracted through
common_open, common_close, common_pread and common_pwrite, so that
ubd_user gets much nicer and ubd_kern stops having bits of the COW layer
stuck in the device structure. Later the entire section can then be
replaced, if desired, with a different implementation, so long as it
provides open/close/pread/pwrite or something similar.
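One common way to get that kind of replaceable layer in C is a table of function pointers. The sketch below is an assumption about how such an interface could look, backed here by plain POSIX file calls; struct common_ops, its member names and host_file_ops are hypothetical and are not the actual common_* functions from the patch.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical backend interface: the four operations named above. */
struct common_ops {
    int     (*open_fn)(const char *path, int flags);
    int     (*close_fn)(int fd);
    ssize_t (*pread_fn)(int fd, void *buf, size_t len, off_t off);
    ssize_t (*pwrite_fn)(int fd, const void *buf, size_t len, off_t off);
};

/* open() is variadic, so it needs a wrapper with a fixed signature. */
static int host_open(const char *path, int flags)
{
    assert(path != NULL);
    return open(path, flags, 0600);
}

/* Default backend: an ordinary file on the host. */
const struct common_ops host_file_ops = {
    .open_fn   = host_open,
    .close_fn  = close,
    .pread_fn  = pread,
    .pwrite_fn = pwrite,
};
```

A caller that only goes through the table, for example `ops->pwrite_fn(fd, buf, 512, offset)`, does not need to know whether the backend is a raw file, a COW file or something else entirely.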
Oh yes, as a side note: 2.6.0 and 2.4.25pre have the fix for the Oops on
the host kernel when reading /dev/shm with hostfs, so if that was
bothering you, the 2.6 or the next 2.4 kernel should help.
James McMechan
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [uml-devel] Dynamic remount with variable COW stacking/merging needed, support for snapshot repilication
2004-01-18 8:20 James W McMechan
@ 2004-01-18 16:13 ` BlaisorBlade
2004-01-19 3:59 ` Matt Zimmerman
0 siblings, 1 reply; 8+ messages in thread
From: BlaisorBlade @ 2004-01-18 16:13 UTC (permalink / raw)
To: James W McMechan; +Cc: user-mode-linux-user, user-mode-linux-devel
At 09:20 on Sunday 18 January 2004, James W McMechan wrote:
> The types of mmaping I think might be needed:
> no mmap in use -- for when mmap does not
> work some fs do not mmap well
> index mmap -- like now only the index/bitmap
> is mmaped uses a fair amount of address space
> full mmap -- map in both the index/bitmap and
> the data, uses a huge amount of address space
> paged index -- map in a (few) pages of the
> index at a time, I have run with one page nicely
> paged data -- map in a (few) pages of the data
> at a time, I am not sure that this is helpful?
If I've correctly understood what Jeff explains, this would be very useful to
reduce memory consumption.
> The problem I see with mmaping is that in order
> to mmap properly I need to kmalloc a buffer
> of the right size first, and then mmap on top of
> the buffer, so that the kernel does not try to
> use the space that is mmaped for other purposes
> outside of the ubd_user/cow_user functions
> which it would be free to do if it is not kmalloc'ed
> and then if it tries to read/write that address space
> the _user function would be using the mmaped
> area not the kernel memory, or when the kernel
> overwrites the index/bitmap with kernel data
> Ick the mind boggles
OK, I've checked, and physmem_subst_mapping does exactly this (even though I
don't see a real reason why kmalloc should reserve the correct space; maybe
this is the bug).
> The problem I see with the ubd=mmap is that
> it does not have a failure path for a
> non-page-aligned data sector, it looks like it
> just drops it, this would I think occur mostly
> on metadata updates, 2.4.23-1um was where
> I was looking, and I could be wrong, but if it
> does I would expect massive data corruption
In fact ubd=mmap does cause data corruption. If you have seen why, post *this
only* to Jeff Dike (he needed several reports before he started thinking mmap
was buggy). I say *this only* because this mail was quite long and a bit hard
to read. Also, there is a user (I'm going to ask him whether he uses ubd=mmap)
who reports some problems (I'm forwarding his message to you).
However, I've seen the missing failure path, but comparing with 2.4.20-6um
(the latest patch I had available on my HD without ubd=mmap) it seems that
mmap is just not used. Also, mmap_fd is called by prepare_request, which does
nothing if the alignment is wrong, while the actual write is done in do_io
anyhow. And I'm sure of this, since the return value of a failing mmap_fd is
the same as when ubd=mmap is not active (from mmap_fd):
	if(!ubd_do_mmap)
		return(-1);

	/* The buffer must be page aligned */
	if(((unsigned long) req->buffer % UBD_MMAP_BLOCK_SIZE) != 0)
		return(-1);
> Oh yes as a side note 2.6.0 and 2.4.25pre
> have the fix for Oops on the host kernel
> when reading /dev/shm with hostfs so if
> that was bothering you the 2.6 or next
> 2.4 kernel should help.
Yes, I saw it. Thanks!
--
cat <<EOSIGN
Paolo Giarrusso, aka Blaisorblade
Linux Kernel 2.4.23/2.6.0 on an i686; Linux registered user n. 292729
EOSIGN
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [uml-devel] Dynamic remount with variable COW stacking/merging needed, support for snapshot repilication
2004-01-18 16:13 ` BlaisorBlade
@ 2004-01-19 3:59 ` Matt Zimmerman
0 siblings, 0 replies; 8+ messages in thread
From: Matt Zimmerman @ 2004-01-19 3:59 UTC (permalink / raw)
To: user-mode-linux-user, user-mode-linux-devel
On Sun, Jan 18, 2004 at 05:13:57PM +0100, BlaisorBlade wrote:
> At 09:20 on Sunday 18 January 2004, James W McMechan wrote:
> > Oh yes as a side note 2.6.0 and 2.4.25pre
> > have the fix for Oops on the host kernel
> > when reading /dev/shm with hostfs so if
> > that was bothering you the 2.6 or next
> > 2.4 kernel should help.
>
> Yes, I saw it. Thanks!
Ooh, good. I stopped using hostfs-on-tmpfs for a number of things due to
this bug, having sent the oops to linux-kernel and having received no
response. I'm glad it's fixed.
--
- mdz
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2004-01-19 3:59 UTC | newest]
Thread overview: 8+ messages
2004-01-17 21:13 [uml-devel] Dynamic remount with variable COW stacking/merging needed, support for snapshot repilication Stephen D. Williams
2004-01-18 4:45 ` [uml-devel] Re: [uml-user] " Jeff Dike
2004-01-18 7:07 ` Stephen D. Williams
2004-01-18 16:23 ` s-uml
2004-01-18 11:04 ` [uml-devel] " BlaisorBlade
-- strict thread matches above, loose matches on Subject: below --
2004-01-18 8:20 James W McMechan
2004-01-18 16:13 ` BlaisorBlade
2004-01-19 3:59 ` Matt Zimmerman