* Re: File system compression, not at the block layer
2004-04-23 20:34 ` Richard B. Johnson
@ 2004-04-23 20:44 ` Måns Rullgård
2004-04-23 20:59 ` Richard B. Johnson
2004-04-23 21:31 ` Joel Jaeggli
` (3 subsequent siblings)
4 siblings, 1 reply; 43+ messages in thread
From: Måns Rullgård @ 2004-04-23 20:44 UTC (permalink / raw)
To: linux-kernel
"Richard B. Johnson" <root@chaos.analogic.com> writes:
> On Fri, 23 Apr 2004, Joel Jaeggli wrote:
>
>> On Fri, 23 Apr 2004, Paul Jackson wrote:
>>
>> > > SO... in addition to the brilliance of AS, is there anything else that
>> > > can be done (using compression or something else) which could aid in
>> > > reducing seek time?
>> >
>> > Buy more disks and only use a small portion of each for all but the
>> > most infrequently accessed data.
>>
>> faster drives. The biggest disks at this point are far slower than the
>> fastest... the average read service time on a Maxtor Atlas 15K is about
>> 5.7 ms; on a 250 GB Western Digital SATA it's 14.1 ms, so more than twice
>> as many reads can be executed on the fastest disks you can buy now... of
>> course then you pay for it in cost, heat, density, and controller costs.
>> Everything is a tradeoff, though.
>>
>
> If you want to have fast disks, then you should do what I
> suggested to Digital 20 years ago when they had ST-506
> interfaces and SCSI was available only from third-parties.
> It was called "striping" (I'm serious!). Not the so-called
> RAID crap that took the original idea and destroyed it.
> If you have 32 bits, you design an interface board for 32
> disks. The interface board stripes one bit of the data to
> each disk. That makes the whole array 32 times faster
> than a single drive and, of course, 32 times larger.
For best performance, the spindles should be synchronized too. This
might be tricky with disks not intended for such operation, of course.
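For illustration, the bit-level striping described above can be sketched in a few lines (a hypothetical model in Python, not any real controller code): each 32-bit word is split so that bit i lands on disk i, which is why the array is 32 times wider, and 32 times faster, than one drive.

```python
def stripe_word(word):
    """Split a 32-bit word across 32 disks: bit i of the
    word goes to disk i (one bit per drive per word)."""
    return [(word >> i) & 1 for i in range(32)]

def unstripe_word(bits):
    """Reassemble a 32-bit word from the one bit read
    back from each of the 32 disks."""
    word = 0
    for i, bit in enumerate(bits):
        word |= (bit & 1) << i
    return word

# Each disk stores exactly 1/32 of the data, so the array moves
# 32 bits per disk bit-time: 32x the bandwidth of a single drive.
```

The round trip holds for any 32-bit word: `unstripe_word(stripe_word(w)) == w`.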
--
Måns Rullgård
mru@kth.se
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: File system compression, not at the block layer
2004-04-23 20:44 ` Måns Rullgård
@ 2004-04-23 20:59 ` Richard B. Johnson
2004-04-23 21:14 ` Ben Greear
2004-04-23 21:18 ` Timothy Miller
0 siblings, 2 replies; 43+ messages in thread
From: Richard B. Johnson @ 2004-04-23 20:59 UTC (permalink / raw)
To: Måns Rullgård; +Cc: linux-kernel
On Fri, 23 Apr 2004, Måns Rullgård wrote:
> "Richard B. Johnson" <root@chaos.analogic.com> writes:
>
> > On Fri, 23 Apr 2004, Joel Jaeggli wrote:
> >
> >> On Fri, 23 Apr 2004, Paul Jackson wrote:
> >>
> >> > > SO... in addition to the brilliance of AS, is there anything else that
> >> > > can be done (using compression or something else) which could aid in
> >> > > reducing seek time?
> >> >
> >> > Buy more disks and only use a small portion of each for all but the
> >> > most infrequently accessed data.
> >>
> >> faster drives. The biggest disks at this point are far slower than the
> >> fastest... the average read service time on a Maxtor Atlas 15K is about
> >> 5.7 ms; on a 250 GB Western Digital SATA it's 14.1 ms, so more than twice
> >> as many reads can be executed on the fastest disks you can buy now... of
> >> course then you pay for it in cost, heat, density, and controller costs.
> >> Everything is a tradeoff, though.
> >>
> >
> > If you want to have fast disks, then you should do what I
> > suggested to Digital 20 years ago when they had ST-506
> > interfaces and SCSI was available only from third-parties.
> > It was called "striping" (I'm serious!). Not the so-called
> > RAID crap that took the original idea and destroyed it.
> > If you have 32 bits, you design an interface board for 32
> > disks. The interface board stripes one bit of the data to
> > each disk. That makes the whole array 32 times faster
> > than a single drive and, of course, 32 times larger.
>
> For best performance, the spindles should be synchronized too. This
> might be tricky with disks not intended for such operation, of course.
Actually not. You need a FIFO to cache your bits into buffers of bytes
anyway. Depending upon the length of the FIFO, you can "rubber-band" a
lot of rotational latency. When you are dealing with a lot of drives,
you are never going to have all the write currents turn on at the same
time anyway because they are (very) soft-sectored, i.e., block
replacement, etc.
Your argument was used to shout down the idea. Actually, I think
it was lost in the NIH syndrome anyway.
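A rough sketch of the per-disk FIFO idea (the `DiskFifo` name and Python are hypothetical, chosen for clarity; nothing like this exists as real hardware or kernel code): each drive fills its own queue at its own rotational phase, and the combiner stalls only when a queue runs empty, so a FIFO of depth d "rubber-bands" up to d bit-times of spindle skew.

```python
from collections import deque

class DiskFifo:
    """Per-disk FIFO decoupling one drive's delivery timing from
    the combiner that reassembles words across all the drives."""
    def __init__(self, depth):
        self.depth = depth
        self.q = deque()

    def push(self, bit):
        """Disk side: called as bits come off the platter."""
        if len(self.q) >= self.depth:
            raise OverflowError("skew exceeded FIFO depth")
        self.q.append(bit)

    def pop(self):
        """Combiner side: raises IndexError if this disk has
        fallen more than `depth` bit-times behind the others."""
        return self.q.popleft()
```

A deeper FIFO tolerates more rotational drift between unsynchronized spindles, at the cost of buffer memory and added latency.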
>
> --
> Måns Rullgård
> mru@kth.se
>
Cheers,
Dick Johnson
Penguin : Linux version 2.4.26 on an i686 machine (5557.45 BogoMips).
Note 96.31% of all statistics are fiction.
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: File system compression, not at the block layer
2004-04-23 20:59 ` Richard B. Johnson
@ 2004-04-23 21:14 ` Ben Greear
2004-04-23 21:25 ` Timothy Miller
2004-04-23 21:18 ` Timothy Miller
1 sibling, 1 reply; 43+ messages in thread
From: Ben Greear @ 2004-04-23 21:14 UTC (permalink / raw)
To: root; +Cc: Måns Rullgård, linux-kernel
Richard B. Johnson wrote:
> Actually not. You need a FIFO to cache your bits into buffers of bytes
> anyway. Depending upon the length of the FIFO, you can "rubber-band" a
> lot of rotational latency. When you are dealing with a lot of drives,
> you are never going to have all the write currents turn on at the same
> time anyway because they are (very) soft-sectored, i.e., block
> replacement, etc.
Wouldn't this pretty much guarantee the worst-case latency scenario for
reading, since on average at least one of your 32 disks is going to
require a full rotation (and probably a seek) to find its bit?
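Ben's intuition can be checked with a small Monte-Carlo estimate (figures are illustrative only, assuming unsynchronized spindles and a hypothetical 4 ms rotation): the array waits for the worst-positioned of the N disks, and the expected maximum of N uniform delays approaches a full rotation, versus half a rotation for a single disk.

```python
import random

def mean_rotational_wait(n_disks, rotation_ms=4.0, trials=20000):
    """Estimate the mean rotational wait for an n-disk bit-striped
    array: every disk must reach its bit, so the array waits for
    the slowest (maximum) of n independent uniform delays."""
    random.seed(42)  # deterministic for reproducibility
    total = 0.0
    for _ in range(trials):
        total += max(random.uniform(0.0, rotation_ms)
                     for _ in range(n_disks))
    return total / trials

single = mean_rotational_wait(1)   # ~2.0 ms: half a rotation
array = mean_rotational_wait(32)   # ~3.9 ms: nearly a full rotation
```

Analytically the mean of the maximum is N/(N+1) of a rotation, so with 32 unsynchronized disks the array is very close to the single-disk worst case on every access.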
Ben
--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc http://www.candelatech.com
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: File system compression, not at the block layer
2004-04-23 21:14 ` Ben Greear
@ 2004-04-23 21:25 ` Timothy Miller
2004-04-24 4:58 ` Ben Greear
0 siblings, 1 reply; 43+ messages in thread
From: Timothy Miller @ 2004-04-23 21:25 UTC (permalink / raw)
To: Ben Greear; +Cc: root, Måns Rullgård, linux-kernel
Ben Greear wrote:
> Richard B. Johnson wrote:
>
>> Actually not. You need a FIFO to cache your bits into buffers of bytes
>> anyway. Depending upon the length of the FIFO, you can "rubber-band" a
>> lot of rotational latency. When you are dealing with a lot of drives,
>> you are never going to have all the write currents turn on at the same
>> time anyway because they are (very) soft-sectored, i.e., block
>> replacement, etc.
>
>
> Wouldn't this pretty much guarantee the worst-case latency scenario for
> reading, since on average at least one of your 32 disks is going to
> require a full rotation (and probably a seek) to find its bit?
Only for the first bit of a block. For large streams of reads, the
FIFOs will keep things going, except occasionally, as drives drift in
their relative rotational positions, which can cause some delays.
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: File system compression, not at the block layer
2004-04-23 21:25 ` Timothy Miller
@ 2004-04-24 4:58 ` Ben Greear
2004-04-27 15:45 ` Timothy Miller
0 siblings, 1 reply; 43+ messages in thread
From: Ben Greear @ 2004-04-24 4:58 UTC (permalink / raw)
To: Timothy Miller; +Cc: root, linux-kernel
Timothy Miller wrote:
>> Wouldn't this pretty much guarantee the worst-case latency scenario for
>> reading, since on average at least one of your 32 disks is going to
>> require a full rotation (and probably a seek) to find its bit?
>
> Only for the first bit of a block. For large streams of reads, the
> FIFOs will keep things going, except occasionally, as drives drift in
> their relative rotational positions, which can cause some delays.
So how is that better than using a striping raid that stripes at the
block level or multi-block level?
Ben
--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc http://www.candelatech.com
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: File system compression, not at the block layer
2004-04-24 4:58 ` Ben Greear
@ 2004-04-27 15:45 ` Timothy Miller
0 siblings, 0 replies; 43+ messages in thread
From: Timothy Miller @ 2004-04-27 15:45 UTC (permalink / raw)
To: Ben Greear; +Cc: root, linux-kernel
Ben Greear wrote:
> Timothy Miller wrote:
>
>>> Wouldn't this pretty much guarantee the worst-case latency scenario for
>>> reading, since on average at least one of your 32 disks is going to
>>> require a full rotation (and probably a seek) to find its bit?
>>
>> Only for the first bit of a block. For large streams of reads, the
>> FIFOs will keep things going, except occasionally, as drives drift
>> in their relative rotational positions, which can cause some delays.
>
>
> So how is that better than using a striping raid that stripes at the
> block level or multi-block level?
>
It's only better for large streaming writes. The FIFOs I'm talking
about above would certainly be smaller than typical RAID0 stripes.
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: File system compression, not at the block layer
2004-04-23 20:59 ` Richard B. Johnson
2004-04-23 21:14 ` Ben Greear
@ 2004-04-23 21:18 ` Timothy Miller
2004-04-24 1:28 ` Horst von Brand
2004-04-24 2:24 ` Tom Vier
1 sibling, 2 replies; 43+ messages in thread
From: Timothy Miller @ 2004-04-23 21:18 UTC (permalink / raw)
To: root; +Cc: Måns Rullgård, linux-kernel
Richard B. Johnson wrote:
>
> Actually not. You need a FIFO to cache your bits into buffers of bytes
> anyway. Depending upon the length of the FIFO, you can "rubber-band" a
> lot of rotational latency. When you are dealing with a lot of drives,
> you are never going to have all the write currents turn on at the same
> time anyway because they are (very) soft-sectored, i.e., block
> replacement, etc.
>
> Your argument was used to shout down the idea. Actually, I think
> it was lost in the NIH syndrome anyway.
>
In a drive with multiple platters and therefore multiple heads, you
could read/write from all heads simultaneously. Or is that how they
already do it?
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: File system compression, not at the block layer
2004-04-23 21:18 ` Timothy Miller
@ 2004-04-24 1:28 ` Horst von Brand
2004-04-24 2:24 ` Tom Vier
1 sibling, 0 replies; 43+ messages in thread
From: Horst von Brand @ 2004-04-24 1:28 UTC (permalink / raw)
To: Timothy Miller; +Cc: Linux Kernel Mailing List
Timothy Miller <miller@techsource.com> said:
[...]
> In a drive with multiple platters and therefore multiple heads, you
> could read/write from all heads simultaneously. Or is that how they
> already do it?
No. Current disks have bad blocks (the bits on disk are way too small to
guarantee a 100% defect-free surface), and they are remapped by the drive
firmware to spare cylinders. Having the exact same blocks broken on each
surface would be a real lottery.
--
Dr. Horst H. von Brand User #22616 counter.li.org
Departamento de Informatica Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria +56 32 654239
Casilla 110-V, Valparaiso, Chile Fax: +56 32 797513
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: File system compression, not at the block layer
2004-04-23 21:18 ` Timothy Miller
2004-04-24 1:28 ` Horst von Brand
@ 2004-04-24 2:24 ` Tom Vier
2004-04-24 7:36 ` Willy Tarreau
2004-04-27 15:43 ` Timothy Miller
1 sibling, 2 replies; 43+ messages in thread
From: Tom Vier @ 2004-04-24 2:24 UTC (permalink / raw)
To: linux-kernel
On Fri, Apr 23, 2004 at 05:18:44PM -0400, Timothy Miller wrote:
> In a drive with multiple platters and therefore multiple heads, you
> could read/write from all heads simultaneously. Or is that how they
> already do it?
From what I've heard, there was once a drive that did this. The problem is
track alignment; these days, you'd need separate motors for each head.
--
Tom Vier <tmv@comcast.net>
DSA Key ID 0x15741ECE
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: File system compression, not at the block layer
2004-04-24 2:24 ` Tom Vier
@ 2004-04-24 7:36 ` Willy Tarreau
2004-04-24 16:02 ` Eric D. Mudama
2004-04-25 3:05 ` Horst von Brand
2004-04-27 15:43 ` Timothy Miller
1 sibling, 2 replies; 43+ messages in thread
From: Willy Tarreau @ 2004-04-24 7:36 UTC (permalink / raw)
To: Tom Vier; +Cc: linux-kernel
On Fri, Apr 23, 2004 at 10:24:58PM -0400, Tom Vier wrote:
> On Fri, Apr 23, 2004 at 05:18:44PM -0400, Timothy Miller wrote:
> > In a drive with multiple platters and therefore multiple heads, you
> > could read/write from all heads simultaneously. Or is that how they
> > already do it?
>
> From what I've heard, there was once a drive that did this. The problem is
> track alignment; these days, you'd need separate motors for each head.
I think they now all do it. Haven't you noticed that drives with many
platters are always faster than their cousins with fewer platters? And
I'm not talking about access time, but about sequential reads.
Willy
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: File system compression, not at the block layer
2004-04-24 7:36 ` Willy Tarreau
@ 2004-04-24 16:02 ` Eric D. Mudama
2004-04-25 3:05 ` Horst von Brand
1 sibling, 0 replies; 43+ messages in thread
From: Eric D. Mudama @ 2004-04-24 16:02 UTC (permalink / raw)
To: Willy Tarreau; +Cc: Tom Vier, linux-kernel
On Sat, Apr 24 at 9:36, Willy Tarreau wrote:
>On Fri, Apr 23, 2004 at 10:24:58PM -0400, Tom Vier wrote:
>> On Fri, Apr 23, 2004 at 05:18:44PM -0400, Timothy Miller wrote:
>> > In a drive with multiple platters and therefore multiple heads, you
>> > could read/write from all heads simultaneously. Or is that how they
>> > already do it?
>>
>> From what I've heard, there was once a drive that did this. The problem is
>> track alignment; these days, you'd need separate motors for each head.
>
>I think they now all do it. Haven't you noticed that drives with many
>platters are always faster than their cousins with fewer platters? And
>I'm not talking about access time, but about sequential reads.
Only one read/write element can be active at one time in a modern disk
drive. The issue is that while the drive's headstack was originally
in alignment, all sorts of factors can cause it to fall out of
alignment. If that occurs, the heads might not line up with each
other, meaning that when you used to line up with A1 and B1 (side A,
cylinder 1) your two heads now align with A1 and B40.
Every surface has embedded servo information, which allows the drive
to work around mechanical variability and handling damage. The
difference in position between adjacent heads in a drive factors into
a parameter called "head switch skew". Head switch skew is "how long
does it take us to seek to the next sequential LBA after reading the
last LBA on a track/head?" Track-to-track skew is how long to seek
and settle on the adjacent track on the same head.
These two parameters are used to generate the drive's format, which in
turn accounts for the sequential throughput (higher skews mean a lower
duty cycle, and thus lower overall throughput). If the skews are set
too low, the drive blows revs because it can't settle in time for the
LBA it needs to read.
In general, a drive with lots of heads will perform better on most
workloads because it doesn't have to seek as far radially to cover the
same amount of data. However, a single-headed and a multi-headed
drive of the same generation should be virtually identical in
sequential throughput... within a few percent. If anything, the
single-headed drive should be a bit faster because track-to-track
skews are typically smaller than headswitch skews.
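The duty-cycle effect Eric describes can be put into rough numbers (every figure below is made up for illustration): the drive reads a full track in one rotation, then loses one skew delay before the first LBA of the next track, so sequential throughput scales by rotation/(rotation + skew).

```python
def sequential_throughput(track_kb, rotation_ms, skew_ms):
    """Effective sequential rate in MB/s: one rotation to read a
    track, plus the skew delay before the next track begins.
    (KB per ms is numerically equal to MB per s.)"""
    return track_kb / (rotation_ms + skew_ms)

# Hypothetical 15k RPM drive: 4 ms per rotation, 500 KB per track.
media_rate = sequential_throughput(500, 4.0, 0.0)   # 125 MB/s raw media rate
head_switch = sequential_throughput(500, 4.0, 1.0)  # 100 MB/s: 1 ms switch skew
track_seek = sequential_throughput(500, 4.0, 0.6)   # higher: smaller track skew
```

This matches Eric's point that a single-headed drive, paying only the (smaller) track-to-track skew, can come out slightly ahead in sequential throughput.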
--eric
--
Eric D. Mudama
edmudama@mail.bounceswoosh.org
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: File system compression, not at the block layer
2004-04-24 7:36 ` Willy Tarreau
2004-04-24 16:02 ` Eric D. Mudama
@ 2004-04-25 3:05 ` Horst von Brand
2004-04-25 7:29 ` Willy Tarreau
1 sibling, 1 reply; 43+ messages in thread
From: Horst von Brand @ 2004-04-25 3:05 UTC (permalink / raw)
To: Willy Tarreau; +Cc: Linux Kernel Mailing List
Willy Tarreau <willy@w.ods.org> said:
> On Fri, Apr 23, 2004 at 10:24:58PM -0400, Tom Vier wrote:
> > On Fri, Apr 23, 2004 at 05:18:44PM -0400, Timothy Miller wrote:
> > > In a drive with multiple platters and therefore multiple heads, you
> > > could read/write from all heads simultaneously. Or is that how they
> > > already do it?
> >
> > From what I've heard, there was once a drive that did this. The problem is
> > track alignment; these days, you'd need separate motors for each head.
> I think they now all do it.
No.
> Haven't you noticed that drives with many
> platters are always faster than their cousins with fewer platters? And
> I'm not talking about access time, but about sequential reads.
Have you ever wondered how they squeeze 16 or more platters into that slim
enclosure? If you take them apart, the question evaporates: There are 2 or
3 platters in them, no more. The "many platters" are an artifact of BIOS'
"disk geometry" description.
--
Dr. Horst H. von Brand User #22616 counter.li.org
Departamento de Informatica Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria +56 32 654239
Casilla 110-V, Valparaiso, Chile Fax: +56 32 797513
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: File system compression, not at the block layer
2004-04-25 3:05 ` Horst von Brand
@ 2004-04-25 7:29 ` Willy Tarreau
2004-04-25 19:50 ` Eric D. Mudama
0 siblings, 1 reply; 43+ messages in thread
From: Willy Tarreau @ 2004-04-25 7:29 UTC (permalink / raw)
To: Horst von Brand; +Cc: Linux Kernel Mailing List
On Sat, Apr 24, 2004 at 11:05:05PM -0400, Horst von Brand wrote:
> > Haven't you noticed that drives with many
> > platters are always faster than their cousins with fewer platters? And
> > I'm not talking about access time, but about sequential reads.
>
> Have you ever wondered how they squeeze 16 or more platters into that slim
> enclosure? If you take them apart, the question evaporates: There are 2 or
> 3 platters in them, no more. The "many platters" are an artifact of BIOS'
> "disk geometry" description.
I know; I was speaking about physical platters, of course. Mark Hann told
me in private that he disagreed with me, so I checked recent disks
(36, 73, 147 GB SCSI with 1, 2, 4 platters) and he was right: they have
exactly the same speed specs. But as I said, I remember the times when I
regularly ran this test on disks I was integrating about 7-8 years ago.
They were 2.1, 4.3, and 6.4 GB (1, 2, 3 platters), and I'm fairly certain
that the 1-platter drive did about 5 MB/s while the 6.4 GB was around
12 MB/s. BTW, the 9 GB SCSI in my PC does about 28 MB/s with 1 platter,
while its 18 GB equivalent (2 platters) does about 51. So I think that what
I observed remained true at such capacities, but changed on bigger disks
because of mechanical constraints. After all, what's 18 GB now? Less than
one twentieth of the biggest disk.
Anyway, this is off-topic, so that's my last post on LKML on the subject.
Regards,
Willy
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: File system compression, not at the block layer
2004-04-25 7:29 ` Willy Tarreau
@ 2004-04-25 19:50 ` Eric D. Mudama
0 siblings, 0 replies; 43+ messages in thread
From: Eric D. Mudama @ 2004-04-25 19:50 UTC (permalink / raw)
To: Willy Tarreau; +Cc: Horst von Brand, Linux Kernel Mailing List
On Sun, Apr 25 at 9:29, Willy Tarreau wrote:
>I know; I was speaking about physical platters, of course. Mark Hann told
>me in private that he disagreed with me, so I checked recent disks
>(36, 73, 147 GB SCSI with 1, 2, 4 platters) and he was right: they have
>exactly the same speed specs. But as I said, I remember the times when I
>regularly ran this test on disks I was integrating about 7-8 years ago.
>They were 2.1, 4.3, and 6.4 GB (1, 2, 3 platters), and I'm fairly certain
>that the 1-platter drive did about 5 MB/s while the 6.4 GB was around
>12 MB/s. BTW, the 9 GB SCSI in my PC does about 28 MB/s with 1 platter,
>while its 18 GB equivalent (2 platters) does about 51. So I think that what
>I observed remained true at such capacities, but changed on bigger disks
>because of mechanical constraints. After all, what's 18 GB now? Less than
>one twentieth of the biggest disk.
>
>Anyway, this is off-topic, so that's my last post on LKML on the subject.
Let me throw in a final $.02...
Are you sure your 9GB and 18GB drives are of the same "generation" of
technology? SCSI drive platters have gotten smaller and smaller to
shorten the seek distance (they use 2.5" media now inside 3.5" drives)
for random operations, and I'm wondering if your 18GB is in fact a
generation ahead of your 9GB.
Are you sure your 9GB SCSI drive only has 1 platter in it?
--eric
--
Eric D. Mudama
edmudama@mail.bounceswoosh.org
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: File system compression, not at the block layer
2004-04-24 2:24 ` Tom Vier
2004-04-24 7:36 ` Willy Tarreau
@ 2004-04-27 15:43 ` Timothy Miller
2004-04-28 0:29 ` Tom Vier
1 sibling, 1 reply; 43+ messages in thread
From: Timothy Miller @ 2004-04-27 15:43 UTC (permalink / raw)
To: Tom Vier; +Cc: linux-kernel
Tom Vier wrote:
> On Fri, Apr 23, 2004 at 05:18:44PM -0400, Timothy Miller wrote:
>
>>In a drive with multiple platters and therefore multiple heads, you
>>could read/write from all heads simultaneously. Or is that how they
>>already do it?
>
>
> From what I've heard, there was once a drive that did this. The problem is
> track alignment; these days, you'd need separate motors for each head.
>
Oh, yeah, I forgot about the separate motors. You would definitely need
those to move the heads independently.
The problem is track alignment. Don't drives dedicate one track on one
platter as an alignment track?
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: File system compression, not at the block layer
2004-04-27 15:43 ` Timothy Miller
@ 2004-04-28 0:29 ` Tom Vier
0 siblings, 0 replies; 43+ messages in thread
From: Tom Vier @ 2004-04-28 0:29 UTC (permalink / raw)
To: Timothy Miller; +Cc: linux-kernel
On Tue, Apr 27, 2004 at 11:43:58AM -0400, Timothy Miller wrote:
> The problem is track alignment. Don't drives dedicate one track on one
> platter as an alignment track?
It used to be that one whole platter was for servo alignment, I think.
Embedded servo signals have been around for at least 7 years.
--
Tom Vier <tmv@comcast.net>
DSA Key ID 0x15741ECE
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: File system compression, not at the block layer
2004-04-23 20:34 ` Richard B. Johnson
2004-04-23 20:44 ` Måns Rullgård
@ 2004-04-23 21:31 ` Joel Jaeggli
2004-04-23 22:20 ` Ian Stirling
2004-04-23 23:34 ` Paul Jackson
` (2 subsequent siblings)
4 siblings, 1 reply; 43+ messages in thread
From: Joel Jaeggli @ 2004-04-23 21:31 UTC (permalink / raw)
To: Richard B. Johnson
Cc: Paul Jackson, Timothy Miller, tytso, miquels, linux-kernel
On Fri, 23 Apr 2004, Richard B. Johnson wrote:
>
> If you want to have fast disks, then you should do what I
> suggested to Digital 20 years ago when they had ST-506
> interfaces and SCSI was available only from third-parties.
> It was called "striping" (I'm serious!). Not the so-called
> RAID crap that took the original idea and destroyed it.
> If you have 32 bits, you design an interface board for 32
> disks. The interface board stripes one bit of the data to
> each disk. That makes the whole array 32 times faster
> than a single drive and, of course, 32 times larger.
>
> There is no redundancy in such an array, just brute-force
> speed. One can add additional bits and CRC correction which
> would allow the failure (or removal) of one drive at a time.
Except disks no longer encode one bit at a time (with PRML), and you're
still serializing requests across all the spindles instead of dividing
requests between spindles... It's pretty clear that for the foreseeable
future capacity growth will continue to far outstrip access speed in
spinning magnetic media. I would agree that any serious improvement is
likely to come from more creatively arranging the data at the block or
filesystem level, NetApp's log-structured RAID4 being one direction to
head...
> Cheers,
> Dick Johnson
> Penguin : Linux version 2.4.26 on an i686 machine (5557.45 BogoMips).
> Note 96.31% of all statistics are fiction.
>
>
--
--------------------------------------------------------------------------
Joel Jaeggli Unix Consulting joelja@darkwing.uoregon.edu
GPG Key Fingerprint: 5C6E 0104 BAF0 40B0 5BD3 C38B F000 35AB B67F 56B2
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: File system compression, not at the block layer
2004-04-23 21:31 ` Joel Jaeggli
@ 2004-04-23 22:20 ` Ian Stirling
0 siblings, 0 replies; 43+ messages in thread
From: Ian Stirling @ 2004-04-23 22:20 UTC (permalink / raw)
To: Joel Jaeggli
Cc: Richard B. Johnson, Paul Jackson, Timothy Miller, tytso, miquels,
linux-kernel
Joel Jaeggli wrote:
> On Fri, 23 Apr 2004, Richard B. Johnson wrote:
>
>>If you want to have fast disks, then you should do what I
>>suggested to Digital 20 years ago when they had ST-506
>>interfaces and SCSI was available only from third-parties.
> Except disks no longer encode one bit at a time (with PRML), and you're
> still serializing requests across all the spindles instead of dividing
> requests between spindles... It's pretty clear that for the foreseeable
> future capacity growth will continue to far outstrip access speed in
> spinning magnetic media. I would agree that any serious improvement is
I happened to do some sums about a week ago.
My first drive was an ST225R, which was 60 MB at 3600 RPM, and the whole
drive could be read in 2 or 3 minutes.
My new 160 GB drive is 7200 RPM, and reads in around 50 minutes.
It's not a complete coincidence that sqrt(160/.06) is about 50, and the
number of revs to read the drive is pretty much dead on 50 times greater.
The areal density of disk drives tends to go up both by adding more tracks
and by squeezing the data into each track more densely.
While you can speed up the disk maybe 5 times if you are willing to pay the
price, the increasing number of tracks means that you're still going to need
lots more revs to read the drive.
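Ian's sum, spelled out with his own figures: if areal density grows equally in tracks per inch and bits per track, capacity grows as the square of linear density, so the number of revolutions needed to read the whole drive grows as the square root of the capacity ratio.

```python
import math

def rev_ratio(new_gb, old_gb):
    """If track count and per-track density scale together, a
    capacity ratio R implies sqrt(R) times more tracks, hence
    sqrt(R) times more revolutions to read the whole drive."""
    return math.sqrt(new_gb / old_gb)

ratio = rev_ratio(160, 0.06)  # ~51.6: matches the observed ~50x in revs
```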
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: File system compression, not at the block layer
2004-04-23 20:34 ` Richard B. Johnson
2004-04-23 20:44 ` Måns Rullgård
2004-04-23 21:31 ` Joel Jaeggli
@ 2004-04-23 23:34 ` Paul Jackson
2004-04-27 15:42 ` Timothy Miller
2004-04-24 1:18 ` Horst von Brand
2004-04-26 10:22 ` Jörn Engel
4 siblings, 1 reply; 43+ messages in thread
From: Paul Jackson @ 2004-04-23 23:34 UTC (permalink / raw)
To: root; +Cc: joelja, miller, tytso, miquels, linux-kernel
> If you want to have fast disks, then you should do what I
> suggested to Digital 20 years ago when they had ST-506
> interfaces and SCSI was available only from third-parties.
> It was called "striping" (I'm serious!).
That gets your bandwidth up, but does nothing for latency.
Depending on your workload, that may or may not be critical.
As a former SGI employee noted:
"Money can buy bandwidth, but latency is forever" -- John Mashey
To get latency down, you need fast rotating disks and short strokes
(waste most of the disk on little used data, or on nothing at all).
And even that won't get you much faster than 20 years ago.
That, or lots of main memory, or if the data is pretty much
read-only, perhaps some complicated data duplication.
But we're not in such bad shape there - folks have been dealing
with that speed difference for at least 20 years ;).
It's the speed difference between the processor and main memory
that's more challenging now - as it approaches speed differences
we once saw between processor and disk.
To heck with disk compression - it's time for main memory compression.
--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.650.933.1373
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: File system compression, not at the block layer
2004-04-23 23:34 ` Paul Jackson
@ 2004-04-27 15:42 ` Timothy Miller
2004-04-27 16:02 ` Jörn Engel
0 siblings, 1 reply; 43+ messages in thread
From: Timothy Miller @ 2004-04-27 15:42 UTC (permalink / raw)
To: Paul Jackson; +Cc: root, joelja, tytso, miquels, linux-kernel
Paul Jackson wrote:
>
> To heck with disk compression - it's time for main memory compression.
>
I think nVidia and ATI chips do that with the Z buffer. Definitely
improves bandwidth utilization.
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: File system compression, not at the block layer
2004-04-27 15:42 ` Timothy Miller
@ 2004-04-27 16:02 ` Jörn Engel
0 siblings, 0 replies; 43+ messages in thread
From: Jörn Engel @ 2004-04-27 16:02 UTC (permalink / raw)
To: Timothy Miller; +Cc: Paul Jackson, root, joelja, tytso, miquels, linux-kernel
On Tue, 27 April 2004 11:42:11 -0400, Timothy Miller wrote:
> Paul Jackson wrote:
>
> >To heck with disk compression - it's time for main memory compression.
>
> I think nVidia and ATI chips do that with the Z buffer. Definitely
> improves bandwidth utilization.
           ^^^^^^^^^
Well stated. For general-purpose CPUs with unpredictable access
patterns, compression makes latency even worse, so you need even
bigger caches.
On the other hand, memory compression makes memory bigger, and memory
of course is a disk cache, so it does improve latency somewhere.
Jörn
--
Victory in war is not repetitious.
-- Sun Tzu
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: File system compression, not at the block layer
2004-04-23 20:34 ` Richard B. Johnson
` (2 preceding siblings ...)
2004-04-23 23:34 ` Paul Jackson
@ 2004-04-24 1:18 ` Horst von Brand
2004-04-26 10:22 ` Jörn Engel
4 siblings, 0 replies; 43+ messages in thread
From: Horst von Brand @ 2004-04-24 1:18 UTC (permalink / raw)
To: root; +Cc: Linux Kernel Mailing List
"Richard B. Johnson" <root@chaos.analogic.com> said:
[...]
> If you want to have fast disks, then you should do what I
> suggested to Digital 20 years ago when they had ST-506
> interfaces and SCSI was available only from third-parties.
> It was called "striping" (I'm serious!). Not the so-called
> RAID crap that took the original idea and destroyed it.
> If you have 32 bits, you design an interface board for 32
> disks. The interface board stripes one bit of the data to
> each disk. That makes the whole array 32 times faster
> than a single drive and, of course, 32 times larger.
But seeks are just as slow as before... and weigh in more, as sectors are
shorter (1/32nd of the visible sector size per disk). I'm not so sure this
is a win overall.
--
Dr. Horst H. von Brand User #22616 counter.li.org
Departamento de Informatica Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria +56 32 654239
Casilla 110-V, Valparaiso, Chile Fax: +56 32 797513
^ permalink raw reply [flat|nested] 43+ messages in thread
* Re: File system compression, not at the block layer
2004-04-23 20:34 ` Richard B. Johnson
` (3 preceding siblings ...)
2004-04-24 1:18 ` Horst von Brand
@ 2004-04-26 10:22 ` Jörn Engel
4 siblings, 0 replies; 43+ messages in thread
From: Jörn Engel @ 2004-04-26 10:22 UTC (permalink / raw)
To: Richard B. Johnson; +Cc: Timothy Miller, linux-kernel
On Fri, 23 April 2004 16:34:21 -0400, Richard B. Johnson wrote:
>
> If you want to have fast disks, then you should do what I
> suggested to Digital 20 years ago when they had ST-506
> interfaces and SCSI was available only from third-parties.
> It was called "striping" (I'm serious!). Not the so-called
> RAID crap that took the original idea and destroyed it.
> If you have 32 bits, you design an interface board for 32
> disks. The interface board stripes one bit of the data to
> each disk. That makes the whole array 32 times faster
> than a single drive and, of course, 32 times larger.
>
> There is no redundancy in such an array, just brute-force
> speed. One can add additional bits and CRC correction which
> would allow the failure (or removal) of one drive at a time.
...and so you add latency to the ever-growing list of concepts you
publicly prove to be unaware of.
Those 32 disks now have something like 32x50MB/s or 1.6GB/s, great.
Seek time is still 10ms, though, so now each seek costs as much as
16MB of continuous data transfer. Nice. So readahead will be 64MB,
and disk cache 1GB, just to get rid of some seeks again? Sure.
If you were a little smarter and used the so-called RAID crap, you
would have stripes of about the readahead size (or more), and seeks would
get spread out between disks. Sure, transfer speed will usually be lower
than 1.6GB/s, but who cares. The point is that each seek will only
cost you as much as 500kB of continuous transfer.
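Jörn's arithmetic, made explicit (using his figures): express the cost of a seek as the sequential transfer forgone while the heads move. In the bit-striped array all 32 spindles seek in lockstep at the full 1.6 GB/s, while in a block-striped RAID a seek stalls only the one ~50 MB/s disk it lands on.

```python
def seek_cost_mb(seek_ms, bandwidth_mb_s):
    """Sequential transfer forgone during one seek, in MB."""
    return seek_ms / 1000.0 * bandwidth_mb_s

# Bit-striped: every spindle seeks together at array bandwidth.
bit_striped = seek_cost_mb(10, 32 * 50)  # 16 MB of transfer lost per seek
# Block-striped RAID: the seek stalls only the one disk it lands on.
raid_stripe = seek_cost_mb(10, 50)       # 0.5 MB of transfer lost per seek
```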
But like so many other things, you will refuse to understand this as
well, right? Well, at least don't try to convince the unaware,
please.
Jörn
--
There's nothing better for promoting creativity in a medium than
making an audience feel "Hmm I could do better than that!"
-- Douglas Adams in a slashdot interview
^ permalink raw reply [flat|nested] 43+ messages in thread