* [linux-lvm] What is a good stripe size?
@ 2001-06-16 13:29 Urs Thuermann
2001-06-17 18:44 ` idsfa
0 siblings, 1 reply; 12+ messages in thread
From: Urs Thuermann @ 2001-06-16 13:29 UTC
To: linux-lvm
I have two SCSI disks in my system with PVs on /dev/sda2 and /dev/sdb2
(/dev/sd{a,b}1 are very small partitions for /boot and swap).
Using the PVs on sda2 and sdb2 I created a single VG and I want to
create striped LVs on it now. My question is, how large should I
choose the stripe size to achieve optimal performance? If I choose it
too large, I will probably lose the win of striping.
What happens if I choose a very small stripe size, say 1K? When I then
need, say, 100K of contiguous blocks of the LV, LVM will read 50K
contiguously from sda2 and 50K from sdb2. But will the 50K from each
of these partitions be read in 50 reads of 1K? Will that degrade
performance?
Does the optimal value of the stripe size depend on how the LV will be
used and the average file size on it? My LVs will be /, /var,
/var/spool/news, /usr, /usr/local, /home and /tftpboot/galois.
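To make the 100K question concrete, here is a minimal sketch of how a striped LV could map a contiguous logical range onto two PVs. It assumes plain round-robin striping that starts at offset 0; the name `split_read` and the layout are illustrative assumptions, not LVM's actual code, and the sketch says nothing about whether the kernel merges adjacent requests into one larger read.

```python
from collections import defaultdict

def split_read(offset, length, stripe_size, n_disks):
    """Split a logical byte range into per-disk (disk_offset, size) chunks.

    Assumes simple round-robin striping: stripe k lives on disk k % n_disks
    at disk offset (k // n_disks) * stripe_size.
    """
    chunks = defaultdict(list)
    end = offset + length
    while offset < end:
        stripe = offset // stripe_size
        within = offset % stripe_size
        size = min(stripe_size - within, end - offset)
        disk = stripe % n_disks
        disk_offset = (stripe // n_disks) * stripe_size + within
        chunks[disk].append((disk_offset, size))
        offset += size
    return dict(chunks)

KIB = 1024
layout = split_read(0, 100 * KIB, 1 * KIB, 2)
for disk, parts in sorted(layout.items()):
    total = sum(size for _, size in parts)
    # In this layout each disk's chunks sit back to back on that disk.
    contiguous = all(a[0] + a[1] == b[0] for a, b in zip(parts, parts[1:]))
    print(disk, len(parts), total // KIB, contiguous)
```

In this model each disk receives 50 chunks of 1K that are physically adjacent, so the cost depends on whether those 50 requests are issued one by one or coalesced.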
urs
* Re: [linux-lvm] What is a good stripe size?
2001-06-16 13:29 Urs Thuermann
@ 2001-06-17 18:44 ` idsfa
0 siblings, 0 replies; 12+ messages in thread
From: idsfa @ 2001-06-17 18:44 UTC
To: linux-lvm
On Sat, Jun 16, 2001 at 03:29:16PM +0200, Urs Thuermann wrote:
> Using the PVs on sda2 and sdb2 I created a single VG and I want to
> create striped LVs on it now. My question is, how large should I
> choose the stripe size to achieve optimal performance? If I choose it
> too large, I will probably lose the win of striping.
You lose the advantage of striping if the stripe size is on the order
of the file size. You want stripes which will be narrower than most
of the files you will be using.
The ideal stripe size is a multiple of the size of the blocks in the
filesystem. Reads and writes work most efficiently when they can
run a block at a time, rather than having to buffer up fragments.
These two guidelines usually leave you with a stripe which is between
1x and 4x your block size.
If you do not know the size of the blocks in your filesystem, there
is usually a filesystem tool which will tell you (dumpe2fs for ext2,
debugreiserfs for reiser, etc etc).
Ex. My machine has only reiserfs. debugreiserfs tells me that the
filesystems use 4096-byte blocks. I therefore have 4096-byte
wide stripes. This also accommodates ext2 (block sizes can be
between 1024 and 4096 bytes) and most of the other available filesystems,
which use multiples of 1k. It is a little wide for those which
use 512-byte blocks, but I am unlikely to use such an obsolete
fs for anything speed critical.
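The two guidelines above can be written down as a small filter. This is only a sketch of the rule of thumb; `candidate_stripe_sizes` and its parameters are made-up names, and whether the LVM tools accept every value it returns is not checked here.

```python
def candidate_stripe_sizes(block_size, typical_file_size, max_multiple=4):
    """Return stripe sizes that are whole multiples of the filesystem block
    size (1x up to max_multiple) and still narrower than a typical file."""
    return [
        k * block_size
        for k in range(1, max_multiple + 1)
        if k * block_size < typical_file_size
    ]

# 4096-byte blocks, files of about 1 MB: every multiple up to 4x qualifies.
print(candidate_stripe_sizes(4096, 1_000_000))
# 4096-byte blocks, files of about 10 kB: only the narrower stripes remain.
print(candidate_stripe_sizes(4096, 10_000))
```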
--
$ fortune -m Kellen
* Re: [linux-lvm] What is a good stripe size?
[not found] <134933704.992803570667.JavaMail.root@boots>
@ 2001-06-17 21:36 ` Wolfgang Weisselberg
2001-06-17 23:13 ` Steven Lembark
2001-06-18 4:07 ` idsfa
0 siblings, 2 replies; 12+ messages in thread
From: Wolfgang Weisselberg @ 2001-06-17 21:36 UTC
To: linux-lvm
Hi, idsfa!
idsfa@visi.com wrote 38 lines:
> On Sat, Jun 16, 2001 at 03:29:16PM +0200, Urs Thuermann wrote:
> > Using the PVs on sda2 and sdb2 I created a single VG and I want to
> > create striped LVs on it now. My question is, how large should I
> > choose the stripe size to achieve optimal performance? If I choose it
> > too large, I will probably lose the win of striping.
> You lose the advantage of striping if the stripe size is on the order
> of the file size. You want stripes which will be narrower than most
> of the files you will be using.
I wonder if you are looking at a single file or general
throughput here.
For a single file you may gain reading speed (writing is less
critical as it is buffered); however with a stripe size below
file size you will need to move the heads of both (or even
more) disks, increasing latency[1], effectively slowing down
reads unless you have fairly large files.
Often with more than one disk and randomly distributed files
(e.g. through the use of stripes) random files are on different
disks, so you can (best case) read as many files at the same
time -- in parallel -- as you have disks. This is only true
for sufficiently large stripes, though. Else a single file
blocks many/all disks, forcing longer/extra seeks for the
other files requested in parallel.
In conclusion (IMHO):
- small stripes increase the latency even for small reads,
hurting throughput (and slowing the reads even when looking
at a single file).
- sufficiently large stripes allow both parallel reads of
small[2] files or accelerated reads (at the cost of
extra -- in this case insignificant[3] -- seek time) of
single large[2] files.
- Systems where the I/O is not the bottleneck, i.e. parallel
reads or accelerated reads of large files won't help much,
will not profit from stripes while still increasing[4] the
risk of data loss through HD/controller failure.
- With lvm you could pvmove the most accessed blocks so that
they are spread over all disks; you could probably even
split those disks in 2 parts: the fast 'begin' (outer
edge) of the platter and most of the (slower) inner parts.
This would have much the same effect as stripes, but would
need more attention (you need to run a program, probably
even deactivating the LV) and would probably have finer granularity.
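The point about small files and parallel reads can be illustrated by counting how many disks one read occupies. The sketch assumes round-robin striping and a file stored contiguously; `disks_touched` is a hypothetical helper, not an LVM function.

```python
def disks_touched(offset, length, stripe_size, n_disks):
    """Count the distinct disks a read of `length` bytes at `offset` hits,
    assuming round-robin striping (stripe k on disk k % n_disks)."""
    first = offset // stripe_size
    last = (offset + length - 1) // stripe_size
    return min(n_disks, last - first + 1)

KIB = 1024
small_file = 8 * KIB
# With 4K stripes the 8K file keeps both disks busy.
print(disks_touched(0, small_file, 4 * KIB, 2))
# With 64K stripes it stays on one disk, leaving the other free.
print(disks_touched(0, small_file, 64 * KIB, 2))
```

A file that happens to straddle a stripe boundary still touches two disks even with large stripes, so the benefit is statistical rather than guaranteed.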
[1] Your seek time rises -- on the average -- the more heads
are in use. With one head you get 1/3 full-seek time for
random head and file locations. With two heads you need
to move both heads, chances are that one of them has a
longer way than the other.
If you look at the first head, the seek time will be the
same if the first head is further away --- which is more
likely the further it is away, i.e. the higher the seek
time already would be for a single disk system. If it
is close (which happens and is part of the 1/3 _average_
seek time) then it is quite likely that the other head is
further away -- thus the average seek time increases.
This is quite uncritical for large files, but with short
files the seek time is greater than the read time. Only
where the increased seek time is small compared to the read
time can a reduced read time reduce the overall time --
that is, only large files will be faster.
[2] small: read time << seek time, fits in one stripe
large: read time >> seek time, does certainly not fit in
one stripe
[3] read time >> seek time
[4] all disks are 'single point of failure'. Most file systems
do not like losing spots all over the place. But then you
do backup religiously, test your backups and have recovery
plans in place, yes?
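Footnote [1] can be checked with a small calculation. The sketch below assumes that seek time is proportional to head travel, that head and target positions are uniform and independent on every disk, and that the request waits for the slowest head. Those are modelling assumptions, not properties of LVM or of real drives, and `expected_longest_seek` is a made-up name.

```python
from fractions import Fraction

def expected_longest_seek(n_heads):
    """Expected longest seek, as a fraction of a full stroke, when n_heads
    independent heads each travel |X - Y| with X, Y uniform on [0, 1].

    One distance has CDF F(d) = 2d - d^2, so
    E[max] = 1 - integral from 0 to 1 of F(d)^n.
    """
    cdf = [Fraction(0), Fraction(2), Fraction(-1)]  # coefficients of 2d - d^2
    power = [Fraction(1)]                           # F(d)^0
    for _ in range(n_heads):
        product = [Fraction(0)] * (len(power) + len(cdf) - 1)
        for i, a in enumerate(power):
            for j, b in enumerate(cdf):
                product[i + j] += a * b
        power = product
    # Integrate the polynomial term by term over [0, 1].
    integral = sum(c / (k + 1) for k, c in enumerate(power))
    return 1 - integral

for n in (1, 2, 4):
    value = expected_longest_seek(n)
    print(n, value, round(float(value), 3))
```

Under these assumptions one head gives the familiar 1/3 of a full stroke and two heads give 7/15, so the average wait for the slower head does rise with the number of heads.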
-Wolfgang
* Re: [linux-lvm] What is a good stripe size?
2001-06-17 21:36 ` [linux-lvm] What is a good stripe size? Wolfgang Weisselberg
@ 2001-06-17 23:13 ` Steven Lembark
2001-06-18 10:04 ` Heinz J. Mauelshagen
2001-06-18 4:07 ` idsfa
1 sibling, 1 reply; 12+ messages in thread
From: Steven Lembark @ 2001-06-17 23:13 UTC
To: linux-lvm
>> On Sat, Jun 16, 2001 at 03:29:16PM +0200, Urs Thuermann wrote:
>> > Using the PVs on sda2 and sdb2 I created a single VG and I want to
>> > create striped LVs on it now. My question is, how large should I
>> > choose the stripe size to achieve optimal performance? If I choose it
>> > too large, I will probably lose the win of striping.
>
>> You lose the advantage of striping if the stripe size is on the order
>> of the file size. You want stripes which will be narrower than most
>> of the files you will be using.
>
> I wonder if you are looking at a single file or general
> throughput here.
>
> For a single file you may gain reading speed (writing is less
> critical as it is buffered); however with a stripe size below
> file size you will need to move the heads of both (or even
> more) disks, increasing latency[1], effectively slowing down
> reads unless you have fairly large files.
Assumes that the stripe blocks are not adjacent on their
respective disk drives. The smaller stripe may have no
penalty if the logical blocks are adjacent on the disk.
One of the main reasons for using LVM at all is to avoid
having to worry about any of this. If raw speed is a major
consideration then use hardware RAID5 w/ strip size ==
I/O block size (e.g., 4 disks w/ 1K chunk and 4K filesystem
block on linux). This avoids the "extra read" penalty and
gives nice, distributed reads.
> [1] Your seek time rises -- on the average -- the more heads
> are in use. With one head you get 1/3 full-seek time for
> a random head and file locations. With two heads you need
> to move both heads, chances are that one of them has a
> longer way than the other.
One advantage of striping is that the seek latency of one
drive can be used for data I/O on another drive. If the LVM
system does any sort of double-buffering then the striped
system can negate/reduce the seek time.
This also leaves out the issue of journaled file systems, which
may have data (or just meta-data) spread out all over the
place -- leaving you with fragmented reads even in the case
of a small file.
Net result is that depending on CPU, bus, controller and
disk hardware and their interactions with the file systems
and the particular type of I/O being performed the answer
becomes "It Depends" :-)
In 15 years the only method I've found that works consistently
is to try however many of the recommendations you can before
committing to any one of them. Benchmark them under realistic
conditions and one will usually be a bit better. At that point you
can 'reverse engineer' why your particular conditions match
that particular theory -- and probably learn a bit about how to
improve your system as a result.
> [4] all disks are 'single point of failure'. Most file systems
> do not like losing spots all over the place. But then you
> do backup religiously, test your backups and have recovery
> plans in place, yes?
Ah, but it's so much more fun to figure out how it all works at
3 in the morning with 20 users breathing fire down your back!
* Re: [linux-lvm] What is a good stripe size?
2001-06-17 21:36 ` [linux-lvm] What is a good stripe size? Wolfgang Weisselberg
2001-06-17 23:13 ` Steven Lembark
@ 2001-06-18 4:07 ` idsfa
1 sibling, 0 replies; 12+ messages in thread
From: idsfa @ 2001-06-18 4:07 UTC
To: linux-lvm
On Sun, Jun 17, 2001 at 11:36:27PM +0200, Wolfgang Weisselberg wrote:
> For a single file you may gain reading speed (writing is less
> critical as it is buffered); however with a stripe size below
> file size you will need to move the heads of both (or even
> more) disks, increasing latency[1], effectively slowing down
> reads unless you have fairly large files.
True, but I was assuming 'typical' use, which means < 2G of small files
(everything having to do with the OS) with the majority of the hard
drive space being used for files in the megabyte+ range (multimedia,
databases and so forth). I can agree with your logical argument that
larger stripes are better for small files, but I would further argue
that striping a partition of small files is the wrong way to go. If
you are looking for a performance increase on reads for small files,
you want to look at RAID 1 (mirroring) or RAID 5 (if you have enough
disk controllers) or a filesystem which optimises access to small
files.
{ I neglected to mention in my example that I do not stripe my OS
partitions, but mirror them instead. I stripe my /home (where I
do graphics and sound work) and /games. }
> In conclusion (IMHO):
> - small stripes increase the latency even for small reads,
> hurting throughput (and slowing the reads even when looking
> at a single file).
I'd have to change that to "increase the latency for small reads".
For files >> stripe size, you will see no increase in latency.
This was stated in your [1] endnote, as well.
> - sufficiently large stripes allow both parallel reads of
> small[2] files or accelerated reads (at the cost of
> extra -- in this case insignificant[3] -- seek time) of
> single large[2] files.
Not argued. I was assuming large files. The answer to "how do I
configure X" is almost always, "what do you want to do with it?"
> - With lvm you could pvmove the most accessed blocks so that
> they are spread over all disks, probably you could even
> split those disks in 2 parts: the fast 'begin' (outer
> edge) of the platter and most of the (slower) inner parts.
> This would have much the same effect of stripes, but would
> need more attention (you need to run a program, probably
> even inactivating the LV) and probably have finer granularity.
Something which would balance PEs within a VG based on their usage
would be a lovely system tool to add to lvm. I can hardly wait for
your program ;-)
Thanks for pointing out where I was making unspoken assumptions!
--
$ fortune -m Kellen
* Re: [linux-lvm] What is a good stripe size?
2001-06-17 23:13 ` Steven Lembark
@ 2001-06-18 10:04 ` Heinz J. Mauelshagen
0 siblings, 0 replies; 12+ messages in thread
From: Heinz J. Mauelshagen @ 2001-06-18 10:04 UTC
To: linux-lvm
On Sun, Jun 17, 2001 at 06:13:51PM -0500, Steven Lembark wrote:
>
> >> On Sat, Jun 16, 2001 at 03:29:16PM +0200, Urs Thuermann wrote:
> >> > Using the PVs on sda2 and sdb2 I created a single VG and I want to
> >> > create striped LVs on it now. My question is, how large should I
> >> > choose the stripe size to achieve optimal performance? If I choose it
> >> > too large, I will probably lose the win of striping.
> >
> >> You lose the advantage of striping if the stripe size is on the order
> >> of the file size. You want stripes which will be narrower than most
> >> of the files you will be using.
> >
> > I wonder if you are looking at a single file or general
> > throughput here.
> >
> > For a single file you may gain reading speed (writing is less
> > critical as it is buffered); however with a stripe size below
> > file size you will need to move the heads of both (or even
> > more) disks, increasing latency[1], effectively slowing down
> > reads unless you have fairly large files.
>
> Assumes that the stripe blocks are not adjacent on their
> respective disk drives. The smaller stripe may have no
> penalty if the logical blocks are adjacent on the disk.
>
> One of the main reasons for using LVM at all is to avoid
> having to worry about any of this. If raw speed is a major
> consideration then use hardware RAID5 w/ strip size ==
> I/O block size (e.g., 4 disks w/ 1K chunk and 4K filesystem
> block on linux). This avoids the "extra read" penalty and
> gives nice, distributed reads.
>
> > [1] Your seek time rises -- on the average -- the more heads
> > are in use. With one head you get 1/3 full-seek time for
> > a random head and file locations. With two heads you need
> > to move both heads, chances are that one of them has a
> > longer way than the other.
>
> One advantage of striping is that the seek latency of one
> drive can be used for data I/O on another drive. If the LVM
> system does any sort of double-buffering then the striped
> system can negate/reduce the seek time.
No double-buffering today :-(
LVM just redirects I/Os to the underlying devices (aka physical volumes).
Regards,
Heinz -- The LVM Guy --
>
> This also leaves out the issue of journaled file systems, which
> may have data (or just meta-data) spread out all over the
> place -- leaving you with fragmented reads even in the case
> of a small file.
>
>
> Net result is that depending on CPU, bus, controller and
> disk hardware and their interactions with the file systems
> and the particular type of I/O being performed the answer
> becomes "It Depends" :-)
>
> In 15 years the only method I've found that works consistently
> is to try however many of the recommendations you can before
> committing to any one of them. Benchmark them under realistic
> conditions and one will usually be a bit better. At that point you
> can 'reverse engineer' why your particular conditions match
> that particular theory -- and probably learn a bit about how to
> improve your system as a result.
>
>
>
>
> > [4] all disks are 'single point of failure'. Most file systems
> > do not like losing spots all over the place. But then you
> > do backup religiously, test your backups and have recovery
> > plans in place, yes?
>
> Ah, but it's so much more fun to figure out how it all works at
> 3 in the morning with 20 users breathing fire down your back!
>
>
>
> _______________________________________________
> linux-lvm mailing list
> linux-lvm@sistina.com
> http://lists.sistina.com/mailman/listinfo/linux-lvm
> read the LVM HOW-TO at http://www.sistina.com/lvm/Pages/howto.html
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Heinz Mauelshagen Sistina Software Inc.
Senior Consultant/Developer Am Sonnenhang 11
56242 Marienrachdorf
Germany
Mauelshagen@Sistina.com +49 2626 141200
FAX 924446
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
* Re: [linux-lvm] What is a good stripe size?
[not found] <134718885.992837796416.JavaMail.root@boots>
@ 2001-06-18 10:33 ` Wolfgang Weisselberg
2001-06-21 8:46 ` Joe Thornber
0 siblings, 1 reply; 12+ messages in thread
From: Wolfgang Weisselberg @ 2001-06-18 10:33 UTC
To: linux-lvm
Hi, idsfa!
idsfa@visi.com wrote 63 lines:
> On Sun, Jun 17, 2001 at 11:36:27PM +0200, Wolfgang Weisselberg wrote:
> > For a single file you may gain reading speed (writing is less
> > critical as it is buffered); however with a stripe size below
> > file size you will need to move the heads of both (or even
> > more) disks, increasing latency[1], effectively slowing down
> > reads unless you have fairly large files.
> True, but I was assuming 'typical' use, which means < 2G of small files
> (everything having to do with the OS) with the majority of the hard
> drive space being used for files in the megabyte+ range (multimedia,
> databases and so forth).
Databases can be special cases here, they often only read small
parts of their storage files (retrieving a 3-char field, e.g.).
But then many databases want raw access and probably will do
striping themselves, too.
A maildir or MH-style central mail spool (e.g. qmail) will
contain lots and lots of small to medium files -- one file
per email.
Further, reiserfs is pretty good with many many files, this will
(over some development time) lead to smaller overall file size
as applications will be more often programmed not to aggregate
data into bigger files. But this is something which will not
trouble us much now.
> I can agree with your logical argument that
> larger stripes are better for small files, but I would further argue
> that striping a partition of small files is the wrong way to go.
Again, this depends. If you need parallel access to small
files, use stripes or even RAID10/15 (mirrored stripes/mirrored
RAID5). This would be the case of a maildir imap server, for
example. If this is no bottleneck -- let it be.
> > In conclusion (IMHO):
> > - small stripes increase the latency even for small reads,
> > hurting throughput (and slowing the reads even when looking
> > at a single file).
> I'd have to change that to "increase the latency for small reads".
> For files >> stripe size, you will see no increase in latency.
You will with small stripe sizes[5], but it's usually negligible,
since you finish earlier than with one disk.
> Something which would balance PEs within a VG based on their usage
> would be a lovely system tool to add to lvm. I can hardly wait for
> your program ;-)
First, we need a fool^Wcrashproof, completely interruptible
pvmove for active LVs that are currently being read from and
written to. Once this is there, we need to have a pvmove which
can be told the physical place to move to, else we can only
spread the most accessed PEs over the PVs.
And at the moment pvmove can only partially move non-striped
LVs.
Then the rest is simply ripping a balancing algorithm from
somewhere and slapping it into a wrapper. Data acquisition is
already done via lvmsadc/lvmsar.
-Wolfgang
[5] Assume:
- 1st head closer to data than second head (about 50% of
the time)
- small strip size (e.g. 4k)
- no bad fragmentation (normal case)
- idle I/O
You request the first 15k:
- head 1 seeks to stripe, so does head 2.
- head 1 arrives; the platter turns until the beginning of the
data; head 1 starts delivering data -- the whole strip (4k) --
and internal caching of the following strips begins.
- head 2 should now deliver the second 4k,
*but is still seeking.*
This is the added latency.
- head 2 finishes seeking, the platter turns until the
begin of the data; head 2 delivers the second 4k.
- from here on the data is put out at 'double rate' from
both disks.
With larger strip sizes the added latency occurs less often,
as up to the whole strip size is read and can be delivered
giving the other head more time to finish seeking.
With non-idle I/O the latency gets worse, of course,
even when reads are reordered to minimize the impact.
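The timeline in [5] can be turned into a toy model. This is a sketch under strong assumptions: both disks start seeking at the same moment, each then streams its strips back to back at a fixed rate, and rotational delay and later seeks are ignored. The function name, the seek times and the transfer rate are illustrative values only.

```python
def time_to_read(nbytes, stripe_size, seek_times, rate):
    """Time until the first `nbytes` of a striped LV have arrived.

    Simplified model: every disk starts seeking at t = 0, disk i finishes
    its seek at seek_times[i], then streams its strips back to back at
    `rate` bytes/s. Rotational delay and later seeks are ignored.
    """
    ready = list(seek_times)      # when each disk can deliver its next byte
    finished = 0.0
    offset = 0
    while offset < nbytes:
        disk = (offset // stripe_size) % len(seek_times)
        size = min(stripe_size - offset % stripe_size, nbytes - offset)
        ready[disk] += size / rate
        finished = max(finished, ready[disk])
        offset += size
    return finished

RATE = 17.5e6                     # bytes/s, illustrative
SEEKS = (0.005, 0.010)            # head 1 is closer to the data than head 2
small = time_to_read(15 * 1024, 4 * 1024, SEEKS, RATE)
large = time_to_read(15 * 1024, 64 * 1024, SEEKS, RATE)
print(f"4k strips:  {small * 1000:.2f} ms")
print(f"64k strips: {large * 1000:.2f} ms")
```

With 4k strips the 15k request cannot complete before the slower head arrives; with 64k strips it is served entirely by the first disk. The comparison would flip if the data happened to sit on the disk with the longer seek.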
* Re: [linux-lvm] What is a good stripe size?
[not found] <134782415.992819826651.JavaMail.root@boots>
@ 2001-06-18 11:19 ` Wolfgang Weisselberg
0 siblings, 0 replies; 12+ messages in thread
From: Wolfgang Weisselberg @ 2001-06-18 11:19 UTC
To: linux-lvm
Hi, Steven!
Steven Lembark wrote 81 lines:
> > For a single file you may gain reading speed (writing is less
> > critical as it is buffered); however with a stripe size below
> > file size you will need to move the heads of both (or even
> > more) disks, increasing latency[1], effectively slowing down
> > reads unless you have fairly large files.
> Assumes that the stripe blocks are not adjacent on their
> respective disk drives. The smaller stripe may have no
> penalty if the logical blocks are adjacent on the disk.
Actually, this assumes that the stripe blocks are not adjacent
-- which they are not, they are on different disks with
independent head movements. :-)
But even if all of the following data requires no further
seeks you still need 2 initial seeks, instead of one. Which
is longer on the average.
> One of the main reasons for using LVM at all is to avoid
> having to worry about any of this.
LVM is about not having to worry over partition sizes and disk
sizes other than the complete pool size.
> If raw speed is a major
> consideration then use hardware RAID5 w/ strip size ==
> I/O block size (e.g., 4 disks w/ 1K chunk and 4K filesystem
> block on linux). This avoids the "extra read" penalty and
> gives nice, distributed reads.
You get the extra read penalty even with RAID5 and small
strip sizes, as the kernel does read-ahead, for example.
Actually, it gets worse (approaching 2/3 seek time, I
*guess*, on the average) with more disks.
And then you should definitely test if a HW raid controller
(which *is* the easiest thing) is as fast as software raid,
especially in degraded mode and during writes. If you are
having a write-heavy partition, this can turn into a
bottleneck. Also remember that with ext2 you get meta-data
every 8M, so if you find one disk doing overtime, tune that
value during the ext2 creation.
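A rough way to see why one disk might end up "doing overtime": if every ext2 block group starts on the same disk, that disk carries the group metadata. The sketch below assumes round-robin striping, block groups laid out from offset 0 of the LV, and metadata at the start of each group. `group_start_disks` is a hypothetical helper and the second group size is purely illustrative; what mke2fs actually accepts is not checked here.

```python
def group_start_disks(n_groups, group_bytes, stripe_size, n_disks):
    """Return the disk on which each block group begins, assuming stripe k
    lives on disk k % n_disks and group g starts at byte g * group_bytes."""
    return [((g * group_bytes) // stripe_size) % n_disks
            for g in range(n_groups)]

MIB = 1024 * 1024
KIB = 1024
# 8M groups on 8K stripes: every group begins on the same disk.
print(group_start_disks(6, 8 * MIB, 8 * KIB, 2))
# Slightly smaller groups: the starts alternate between the two disks.
print(group_start_disks(6, 8 * MIB - 8 * KIB, 8 * KIB, 2))
```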
> > [1] Your seek time rises -- on the average -- the more heads
> > are in use. With one head you get 1/3 full-seek time for
> > a random head and file locations. With two heads you need
> > to move both heads, chances are that one of them has a
> > longer way than the other.
> One advantage of striping is that the seek latency of one
> drive can be used for data I/O on another drive.
True, but latency still increases. And it's also true for
larger strip sizes -- with less increase in latency for
each file.
> If the LVM
> system does any sort of double-buffering then the striped
> system can negate/reduce the seek time.
Only if there is enough to be read.
At 17.5 MB/s and 8.9 ms average seek time, you can read just
under 160 KB in the time of one seek.
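The arithmetic behind that figure, as a quick check. Whether "MB" here means 2^20 or 10^6 bytes is not stated, so the sketch computes both; `bytes_per_seek` is just an illustrative name.

```python
def bytes_per_seek(rate_mb_per_s, seek_ms, mb=1024 * 1024):
    """Bytes that could be streamed during one average seek."""
    return rate_mb_per_s * mb * seek_ms / 1000.0

binary = bytes_per_seek(17.5, 8.9) / 1024            # KiB, with 1 MB = 2**20 bytes
decimal = bytes_per_seek(17.5, 8.9, mb=10**6) / 1000  # kB, with 1 MB = 10**6 bytes
print(round(binary, 1), round(decimal, 1))
```

Both readings land a little under 160 (roughly 159 KiB and 156 kB), so a stripe much smaller than that costs more in seeking than it gains in transfer.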
> This also leaves out the issue of journaled file systems, which
> may have data (or just meta-data) spread out all over the
> place -- leaving you with fragmented reads even in the case
> of a small file.
Meta-data is usually cached -- and it does not help if you
need 2 seeks (or even 1.2 seeks on the average) instead of
one for each meta-data fragment.
> Net result is that depending on CPU, bus, controller and
> disk hardware and their interactions with the file systems
> and the particular type of I/O being performed the answer
> becomes "It Depends" :-)
Well, ain't it good no one told this to Donald E. Knuth before he
wrote many highly mathematical things and thoughts about
it in 'The Art of Computer Programming'? :-)
> In 15 years the only method I've found that works consistently
> is to try however many of the recommendations you can before
> committing to any one of them.
True, that method usually works, if your trials are really
realistic.
> > do not like losing spots all over the place. But then you
> > do backup religiously, test your backups and have recovery
> > plans in place, yes?
> Ah, but it's so much more fun to figure out how it all works at
> 3 in the morning with 20 users breathing fire down your back!
For some values of 'fun' approaching microsoftian proportions?
Maybe. Bungee jumping with piano wire as rope tends to be
more enjoyable, I have been told.
-Wolfgang
* Re: [linux-lvm] What is a good stripe size?
2001-06-18 10:33 ` Wolfgang Weisselberg
@ 2001-06-21 8:46 ` Joe Thornber
2001-06-22 0:59 ` Wolfgang Weisselberg
0 siblings, 1 reply; 12+ messages in thread
From: Joe Thornber @ 2001-06-21 8:46 UTC
To: linux-lvm
On Mon, Jun 18, 2001 at 12:33:18PM +0200, Wolfgang Weisselberg wrote:
...
> First, we need a fool^Wcrashproof, completely interruptible
> pvmove for active, being currently read from and written
> to LVs.
I hope this is already there. Do your experiences suggest it isn't
working for active pv's ?
> Once this is there, we need to have a pvmove which
> can be told the physical place to move to, else we can only
> spread the most accessed PEs over the PVs.
This is exactly what we're planning for the next version of LVM. The
moving of extents will be performed by the kernel rather than in user
space, the ioctl interface will allow the user to specify a list of PE
movements.
> Then the rest is simply ripping a balancing algorithm from
> somewhere and slapping it into a wrapper. Data acquisition is
> already done via lvmsadc/lvmsar.
Yes, a little Perl script to process the usage stats and then create a
new map.
- Joe
* Re: [linux-lvm] What is a good stripe size?
2001-06-21 8:46 ` Joe Thornber
@ 2001-06-22 0:59 ` Wolfgang Weisselberg
2001-06-22 9:49 ` Joe Thornber
0 siblings, 1 reply; 12+ messages in thread
From: Wolfgang Weisselberg @ 2001-06-22 0:59 UTC
To: linux-lvm
Joe Thornber (thornber@btconnect.com) wrote 33 lines:
> On Mon, Jun 18, 2001 at 12:33:18PM +0200, Wolfgang Weisselberg wrote:
> > First, we need a fool^Wcrashproof, completely interruptible
> > pvmove for active, being currently read from and written
> > to LVs.
> I hope this is already there. Do your experiences suggest it isn't
> working for active pv's ?
man pvmove:
[...]
You can move physical extents in use but make sure you
have an current backup in case of a system crash while
moving!!!
Now, this does *not* look like *crashproof*, does it?
This is from the CVS, btw, dated 2001-06-10.
And I am unwilling to have a tool run automatically (at
night?) that - upon a crash - can destroy whole partitions.
I could live with a 'copy, repeat if original was changed,
lock LE, update maps on HD, unlock LE (on new PE)' thing.
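The 'copy, repeat if changed, lock, update maps, unlock' idea could look roughly like the single-threaded sketch below. Everything in it is hypothetical: the `Extent` class, the generation counter, the `during_copy` hook that stands in for concurrent writers, and the in-memory dictionaries that stand in for on-disk maps. It is not how pvmove is implemented, and a real version would need an actual lock around the final step.

```python
class Extent:
    """Toy physical extent with a counter that is bumped on every write."""
    def __init__(self, data):
        self.data = bytearray(data)
        self.generation = 0

    def write(self, pos, payload):
        self.data[pos:pos + len(payload)] = payload
        self.generation += 1

def move_extent(le_map, le, dst_pe, extents, during_copy=None, max_passes=5):
    """Copy the extent behind logical extent `le` to `dst_pe`, repeating the
    copy while the source keeps changing, then switch the map."""
    src = extents[le_map[le]]
    for attempt in range(max_passes):
        seen = src.generation
        snapshot = bytes(src.data)        # the slow bulk copy
        if during_copy is not None:
            during_copy(attempt, src)     # simulated writer activity
        if src.generation == seen:
            break                         # source unchanged, copy is current
    else:
        raise RuntimeError("extent too busy to move")
    # "lock LE ... unlock LE": nothing else can run here in this sketch.
    extents[dst_pe] = Extent(snapshot)
    le_map[le] = dst_pe                   # "update maps"
    return dst_pe

extents = {"pv0:7": Extent(b"old data")}
le_map = {0: "pv0:7"}

def writer(attempt, src):
    if attempt == 0:                      # one write lands during the first pass
        src.write(0, b"NEW")

move_extent(le_map, 0, "pv1:3", extents, during_copy=writer)
print(le_map[0], bytes(extents["pv1:3"].data))
```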
I have used pvmove about once in earnest, and my, moving 35 Gigs
over to a new HD takes time! And that does not look like a
HD bottleneck...
> This is exactly what we're planning for the next version of LVM. The
> moving of extents will be performed by the kernel rather than in user
> space, the ioctl interface will allow the user to specify a list of pe
> movements.
Sounds interesting.
> > Then the rest is simply ripping a balancing algorithm from
> > somewhere and slapping it into a wrapper. Data acquisition is
> > already done via lvmsadc/lvmsar.
> Yes, a little Perl script to process the usage stats and then create a
> new map.
There it might be interesting to know which blocks are accessed
in sequence. Then again it might not, I am not an expert
there and haven't done the maths for that.
-Wolfgang
* Re: [linux-lvm] What is a good stripe size?
2001-06-22 0:59 ` Wolfgang Weisselberg
@ 2001-06-22 9:49 ` Joe Thornber
2001-06-22 14:25 ` Heinz J. Mauelshagen
0 siblings, 1 reply; 12+ messages in thread
From: Joe Thornber @ 2001-06-22 9:49 UTC
To: linux-lvm
On Fri, Jun 22, 2001 at 02:59:26AM +0200, Wolfgang Weisselberg wrote:
> man pvmove:
> [...]
> You can move physical extents in use but make sure you
> have an current backup in case of a system crash while
> moving!!!
>
> Now, this does *not* look like *crashproof*, does it?
> This is from the CVS, btw, dated 2001-06-10.
Heinz,
Do we update the metadata after each individual extent is moved ? Or
at the end of the whole pvmove ?
I'd assumed we were doing the former, in which case this comment from
the man page is unduly pessimistic.
- Joe
* Re: [linux-lvm] What is a good stripe size?
2001-06-22 9:49 ` Joe Thornber
@ 2001-06-22 14:25 ` Heinz J. Mauelshagen
0 siblings, 0 replies; 12+ messages in thread
From: Heinz J. Mauelshagen @ 2001-06-22 14:25 UTC
To: linux-lvm
On Fri, Jun 22, 2001 at 10:49:21AM +0100, Joe Thornber wrote:
> On Fri, Jun 22, 2001 at 02:59:26AM +0200, Wolfgang Weisselberg wrote:
> > man pvmove:
> > [...]
> > You can move physical extents in use but make sure you
> > have an current backup in case of a system crash while
> > moving!!!
> >
> > Now, this does *not* look like *crashproof*, does it?
> > This is from the CVS, btw, dated 2001-06-10.
>
> Heinz,
>
> Do we update the metadata after each individual extent is moved ? Or
> at the end of the whole pvmove ?
1st case.
Metadata is updated after every single moved extent.
The only reason why the man page says the above is the non atomic nature
of the metadata updates (2 block devices need to be updated; source and
destination) in case of the version 1 metadata layout.
With v2 (transaction oriented updates) we should be fine and can remove
the above warning.
>
> I'd assumed we were doing the former, in which case this comment from
> the man page is unduly pessimistic.
Yes, it is somewhat pessimistic, but disks can break, power can be
interrupted or a million other things can happen which let the source device
metadata update make it but *not* the destination device update.
We could eventually recover from that situation by updating the destination
before the source and checking double entries for the same logical extent
on the source and destination device at VG activation time in order to remove
one of the two entries.
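The recovery idea above (destination first, then source, and a duplicate check at activation) might be sketched like this. The dictionaries stand in for per-PV metadata, all names are invented for illustration, and the sketch assumes the extent data was fully copied before either metadata update. Keeping the destination entry is an arbitrary choice here; the text only says one of the two entries is removed.

```python
def move_metadata(src_md, dst_md, le, dst_pe, crash_after_dst=False):
    """Record the move of logical extent `le`: destination first, then source."""
    dst_md[le] = dst_pe          # step 1: destination PV learns about the LE
    if crash_after_dst:
        return                   # simulated crash between the two updates
    del src_md[le]               # step 2: source PV forgets the LE

def activate(src_md, dst_md):
    """At activation, resolve any LE recorded on both PVs by keeping one entry."""
    for le in sorted(set(src_md) & set(dst_md)):
        del src_md[le]           # arbitrary choice in this sketch: keep the destination
    merged = dict(src_md)
    merged.update(dst_md)
    return merged

src_md = {0: "sda2:0", 1: "sda2:1"}
dst_md = {}
move_metadata(src_md, dst_md, 1, "sdb2:0", crash_after_dst=True)
print(sorted(set(src_md) & set(dst_md)))   # LE 1 is recorded twice after the crash
print(activate(src_md, dst_md))
```

With this ordering a crash can leave an extent recorded twice but never zero times, which is what makes the cleanup at activation possible.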
Heinz
>
> - Joe