* dm-multipath has great throughput but we'd like more!
@ 2006-05-18 7:05 Bob Gautier
2006-05-18 7:19 ` [Consult-list] " Bob Gautier
` (2 more replies)
0 siblings, 3 replies; 15+ messages in thread
From: Bob Gautier @ 2006-05-18 7:05 UTC (permalink / raw)
To: consult-list, dm-devel
Yesterday my client was testing multipath load balancing and failover
on a system running ext3 on a logical volume which comprises about ten
SAN LUNs, all reached using multipath in multibus mode over two QL2340
HBAs.
On the one hand, the client is very impressed: running bonnie++
(inspired by Ronan's GFS v VxFS example) we get just over 200Mbyte/s
over the two HBAs, and when we pull a link we get about 120MByte/s.
The throughput and failover response times are better than the client
has ever seen, but we're wondering why we are not seeing higher
throughput per-HBA -- the QL2340 datasheet says it should manage
200Mbyte/s and all switches etc. run at 2Gbit/s.
Any ideas?
Bob Gautier
+44 7921 700996
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Consult-list] dm-multipath has great throughput but we'd like more!
2006-05-18 7:05 dm-multipath has great throughput but we'd like more! Bob Gautier
@ 2006-05-18 7:19 ` Bob Gautier
2006-05-18 7:27 ` Luca Berra
2006-05-18 7:25 ` Jonathan E Brassow
2006-05-18 17:00 ` [Consult-list] " Rod Nayfield
2 siblings, 1 reply; 15+ messages in thread
From: Bob Gautier @ 2006-05-18 7:19 UTC (permalink / raw)
To: consult-list; +Cc: dm-devel
On Thu, 2006-05-18 at 08:05 +0100, Bob Gautier wrote:
> Yesterday my client was testing of multipath load balancing and failover
> on a system running ext3 on a logical volume which comprises about ten
> SAN LUNs all reached using multipath in multibus mode over two QL2340
> HBAs.
Sorry, should have also said: RHEL4u3 x86_64 on DL385 with 16GB memory
Bob G
* Re: dm-multipath has great throughput but we'd like more!
2006-05-18 7:05 dm-multipath has great throughput but we'd like more! Bob Gautier
2006-05-18 7:19 ` [Consult-list] " Bob Gautier
@ 2006-05-18 7:25 ` Jonathan E Brassow
2006-05-18 7:44 ` Bob Gautier
2006-05-18 17:00 ` [Consult-list] " Rod Nayfield
2 siblings, 1 reply; 15+ messages in thread
From: Jonathan E Brassow @ 2006-05-18 7:25 UTC (permalink / raw)
To: rgautier, device-mapper development
The system bus isn't a limiting factor, is it? 64-bit PCI-X at 133MHz
will get about 8.5 Gbit/s (roughly 1 GB/s, plenty), but 32-bit PCI at
33MHz only got 133MB/s.
Can your disks sustain that much bandwidth? 10 striped drives might get
better than 200MB/s if done right, I suppose.
Don't the switches run at 2 Gbits/s? 2 Gbits/s / 10 bits per byte (8
data bits plus 2 for the 8b/10b line encoding) ~= 200MB/s.
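
As a rough check of the arithmetic above, the sketch below converts a serial line rate into approximate payload bandwidth. It assumes 8b/10b encoding (10 line bits per data byte), which is what the divide-by-10 rule of thumb amounts to. The function name and the nominal 2 Gbit/s figure are illustrative only, and real links lose a little more to framing.

```python
def payload_mbytes_per_s(line_rate_gbit: float, bits_per_byte_on_wire: int = 10) -> float:
    """Approximate payload bandwidth in MB/s for a serial link.

    Assumes 8b/10b encoding by default: each data byte costs 10 line bits.
    """
    line_bits_per_s = line_rate_gbit * 1e9
    return line_bits_per_s / bits_per_byte_on_wire / 1e6


one_link = payload_mbytes_per_s(2.0)   # one nominal 2 Gbit/s link
two_links = 2 * one_link               # two HBAs, one link each
print(f"one link:  {one_link:.0f} MB/s")
print(f"two links: {two_links:.0f} MB/s")
```

On these assumptions a single link tops out near 200 MB/s, so a result of about 200 MB/s across two links is roughly half of the nominal ceiling.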
Could be a bunch of reasons...
brassow
* Re: Re: [Consult-list] dm-multipath has great throughput but we'd like more!
2006-05-18 7:19 ` [Consult-list] " Bob Gautier
@ 2006-05-18 7:27 ` Luca Berra
2006-05-18 7:36 ` Jonathan E Brassow
0 siblings, 1 reply; 15+ messages in thread
From: Luca Berra @ 2006-05-18 7:27 UTC (permalink / raw)
To: rgautier, device-mapper development
On Thu, May 18, 2006 at 08:19:09AM +0100, Bob Gautier wrote:
>On Thu, 2006-05-18 at 08:05 +0100, Bob Gautier wrote:
>> Yesterday my client was testing of multipath load balancing and failover
>> on a system running ext3 on a logical volume which comprises about ten
>> SAN LUNs all reached using multipath in multibus mode over two QL2340
>> HBAs.
>
>Sorry, should have also said: RHEL4u3 x86_64 on DL385 with 16Gb memory
>
you could have added the storage model :)
btw did you do any tuning (e.g. readahead, scheduler)?
L.
--
Luca Berra -- bluca@comedia.it
Communication Media & Services S.r.l.
/"\
\ / ASCII RIBBON CAMPAIGN
X AGAINST HTML MAIL
/ \
* Re: Re: [Consult-list] dm-multipath has great throughput but we'd like more!
2006-05-18 7:27 ` Luca Berra
@ 2006-05-18 7:36 ` Jonathan E Brassow
2006-05-18 7:44 ` Luca Berra
0 siblings, 1 reply; 15+ messages in thread
From: Jonathan E Brassow @ 2006-05-18 7:36 UTC (permalink / raw)
To: device-mapper development
On May 18, 2006, at 2:27 AM, Luca Berra wrote:
> On Thu, May 18, 2006 at 08:19:09AM +0100, Bob Gautier wrote:
>> On Thu, 2006-05-18 at 08:05 +0100, Bob Gautier wrote:
>>> Yesterday my client was testing of multipath load balancing and
>>> failover
>>> on a system running ext3 on a logical volume which comprises about
>>> ten
>>> SAN LUNs all reached using multipath in multibus mode over two QL2340
>>> HBAs.
>>
>> Sorry, should have also said: RHEL4u3 x86_64 on DL385 with 16Gb memory
>>
> you could have added the storage model :)
> btw did you do any tuning (i.e. readhaead, scheduler) ?
>
You may also want to do raw throughput testing (lmdd perhaps) to get a
baseline. Certainly, the file system has to be bringing down the
bandwidth somewhat...
brassow
* Re: dm-multipath has great throughput but we'd like more!
2006-05-18 7:25 ` Jonathan E Brassow
@ 2006-05-18 7:44 ` Bob Gautier
2006-05-18 7:55 ` Jonathan E Brassow
` (3 more replies)
0 siblings, 4 replies; 15+ messages in thread
From: Bob Gautier @ 2006-05-18 7:44 UTC (permalink / raw)
To: Jonathan E Brassow; +Cc: device-mapper development, consult-list
On Thu, 2006-05-18 at 02:25 -0500, Jonathan E Brassow wrote:
> The system bus isn't a limiting factor is it? 64-bit PCI-X will get
> 8.5 GB/s (plenty), but 32-bit PCI 33MHz got 133MB/s.
>
> Can your disks sustain that much bandwidth? 10 striped drives might get
> better than 200MB/s if done right, I suppose.
>
> Don't the switches run at 2 Gbits/s? 2 Gbits/s / 10 (throw in 2 bits
> for protocol) ~= 200MB/s.
>
Thanks for the fast responses:
The card is a 64-bit PCI-X, so I don't think the bus is the bottleneck,
and anyway the vendor specifies a maximum throughput of 200Mbyte/s per
card.
The disk array does not appear to be the bottleneck because we get
200Mbyte/s when we use *two* HBAs in load-balanced mode.
The question is really about why we only see O(100Mbyte/s) with one HBA
when we can achieve O(200MByte/s) with two cards, given that one card
should be able to achieve that throughput.
I don't think the method of producing the traffic (bonnie++ or something
else) should be relevant but if it were that would be very interesting
for the benchmark authors!
The storage is an HDS 9980 (I think?)
* Re: Re: [Consult-list] dm-multipath has great throughput but we'd like more!
2006-05-18 7:36 ` Jonathan E Brassow
@ 2006-05-18 7:44 ` Luca Berra
0 siblings, 0 replies; 15+ messages in thread
From: Luca Berra @ 2006-05-18 7:44 UTC (permalink / raw)
To: dm-devel
On Thu, May 18, 2006 at 02:36:00AM -0500, Jonathan E Brassow wrote:
>
>On May 18, 2006, at 2:27 AM, Luca Berra wrote:
>
>>On Thu, May 18, 2006 at 08:19:09AM +0100, Bob Gautier wrote:
>>>On Thu, 2006-05-18 at 08:05 +0100, Bob Gautier wrote:
>>>>Yesterday my client was testing of multipath load balancing and
>>>>failover
>>>>on a system running ext3 on a logical volume which comprises about
>>>>ten
>>>>SAN LUNs all reached using multipath in multibus mode over two QL2340
>>>>HBAs.
>>>
>>>Sorry, should have also said: RHEL4u3 x86_64 on DL385 with 16Gb memory
>>>
>>you could have added the storage model :)
>>btw did you do any tuning (i.e. readhaead, scheduler) ?
>>
>
>You may also want to do raw through-put testing (lmdd perhaps) to get a
>baseline. Certainly, the file system has to be bringing down the
>bandwidth somewhat...
>
in some cases even the partition alignment might introduce delays.
L.
--
Luca Berra -- bluca@comedia.it
Communication Media & Services S.r.l.
/"\
\ / ASCII RIBBON CAMPAIGN
X AGAINST HTML MAIL
/ \
* Re: dm-multipath has great throughput but we'd like more!
2006-05-18 7:44 ` Bob Gautier
@ 2006-05-18 7:55 ` Jonathan E Brassow
2006-05-18 7:59 ` Luca Berra
` (2 subsequent siblings)
3 siblings, 0 replies; 15+ messages in thread
From: Jonathan E Brassow @ 2006-05-18 7:55 UTC (permalink / raw)
To: rgautier; +Cc: device-mapper development, consult-list
On May 18, 2006, at 2:44 AM, Bob Gautier wrote:
> On Thu, 2006-05-18 at 02:25 -0500, Jonathan E Brassow wrote:
>> The system bus isn't a limiting factor is it? 64-bit PCI-X will get
>> 8.5 GB/s (plenty), but 32-bit PCI 33MHz got 133MB/s.
>>
>> Can your disks sustain that much bandwidth? 10 striped drives might
>> get
>> better than 200MB/s if done right, I suppose.
>>
>> Don't the switches run at 2 Gbits/s? 2 Gbits/s / 10 (throw in 2 bits
>> for protocol) ~= 200MB/s.
>>
>
> Thanks for the fast responses:
>
> The card is a 64-bit PCI-X, so I don't think the bus is the bottleneck,
> and anyway the vendor specifies a maximum throughput of 200Mbyte/s per
> card.
>
> The disk array does not appear to be the bottleneck because we get
> 200Mbyte/s when we use *two* HBAs in load-balanced mode.
>
> The question is really about why we only see O(100Mbyte/s) with one HBA
> when we can achieve O(200MByte/s) with two cards, given that one card
> should be able to achieve that throughput.
>
> I don't think the method of producing the traffic (bonnie++ or
> something
> else) should be relevant but if it were that would be very interesting
> for the benchmark authors!
>
> The storage is an HDS 9980 (I think?)
>
I guess I was thinking you were asking why you weren't getting 240MB/s,
and I overlooked the obvious question. I guess I don't know the answer
(or even the right questions). :(
brassow
* Re: dm-multipath has great throughput but we'd like more!
2006-05-18 7:44 ` Bob Gautier
2006-05-18 7:55 ` Jonathan E Brassow
@ 2006-05-18 7:59 ` Luca Berra
2006-05-18 8:04 ` [Consult-list] " Nicholas C. Strugnell
2006-05-18 20:28 ` Steve Lord
3 siblings, 0 replies; 15+ messages in thread
From: Luca Berra @ 2006-05-18 7:59 UTC (permalink / raw)
To: device-mapper development
On Thu, May 18, 2006 at 08:44:14AM +0100, Bob Gautier wrote:
>The card is a 64-bit PCI-X, so I don't think the bus is the bottleneck,
>and anyway the vendor specifies a maximum throughput of 200Mbyte/s per
>card.
>
>The disk array does not appear to be the bottleneck because we get
>200Mbyte/s when we use *two* HBAs in load-balanced mode.
>
>The question is really about why we only see O(100Mbyte/s) with one HBA
>when we can achieve O(200MByte/s) with two cards, given that one card
>should be able to achieve that throughput.
>
>I don't think the method of producing the traffic (bonnie++ or something
>else) should be relevant but if it were that would be very interesting
>for the benchmark authors!
>
>The storage is an HDS 9980 (I think?)
i am not an expert on Hitachi storage,
anyway
does each hba map to a different controller on the storage?
do you have some statistics on disk usage from the storage side?
L.
--
Luca Berra -- bluca@comedia.it
Communication Media & Services S.r.l.
/"\
\ / ASCII RIBBON CAMPAIGN
X AGAINST HTML MAIL
/ \
* Re: [Consult-list] Re: dm-multipath has great throughput but we'd like more!
2006-05-18 7:44 ` Bob Gautier
2006-05-18 7:55 ` Jonathan E Brassow
2006-05-18 7:59 ` Luca Berra
@ 2006-05-18 8:04 ` Nicholas C. Strugnell
2006-05-18 9:42 ` Nicholas C. Strugnell
2006-05-18 20:28 ` Steve Lord
3 siblings, 1 reply; 15+ messages in thread
From: Nicholas C. Strugnell @ 2006-05-18 8:04 UTC (permalink / raw)
To: rgautier; +Cc: device-mapper development, consult-list
On Thu, 2006-05-18 at 08:44 +0100, Bob Gautier wrote:
> On Thu, 2006-05-18 at 02:25 -0500, Jonathan E Brassow wrote:
> > The system bus isn't a limiting factor is it? 64-bit PCI-X will get
> > 8.5 GB/s (plenty), but 32-bit PCI 33MHz got 133MB/s.
> >
> > Can your disks sustain that much bandwidth? 10 striped drives might get
> > better than 200MB/s if done right, I suppose.
> >
This is an HDS Lightning - 64GB of mirrored write cache - I doubt if any
of the writes even see disk :-)
> > Don't the switches run at 2 Gbits/s? 2 Gbits/s / 10 (throw in 2 bits
> > for protocol) ~= 200MB/s.
> >
>
> Thanks for the fast responses:
>
> The card is a 64-bit PCI-X, so I don't think the bus is the bottleneck,
> and anyway the vendor specifies a maximum throughput of 200Mbyte/s per
> card.
>
> The disk array does not appear to be the bottleneck because we get
> 200Mbyte/s when we use *two* HBAs in load-balanced mode.
>
> The question is really about why we only see O(100Mbyte/s) with one HBA
> when we can achieve O(200MByte/s) with two cards, given that one card
> should be able to achieve that throughput.
>
We've just done _exactly_ the same test against an EVA 8000 with 8
active paths - theoretically we should be able to get 4Gb/s via two HBAs
but in fact we saw max 200MB/s with ext2, dropping to 160MB/s with ext3.
This was to a fairly slow RAID5 but that is irrelevant as we have a
16GB write cache and we were only writing 4GB files with bonnie++.
I'm not sure where the overhead is. The fact that we see a 20%
performance drop when we switch journalling on suggests that the
overhead might be in the filesystem perhaps?
It might make sense to test raw writes to a device with dd and see if
that gets comparable performance figures - I'll just try that myself
actually.
Nick
--
M: +44 (0)7736 665171 Skype: nstrug
http://europe.redhat.com
GPG FPR: 9C6C 093C 756A 6C57 49A1 E211 BBBA F5F5 C440 5DE0
* Re: [Consult-list] Re: dm-multipath has great throughput but we'd like more!
2006-05-18 8:04 ` [Consult-list] " Nicholas C. Strugnell
@ 2006-05-18 9:42 ` Nicholas C. Strugnell
2006-05-18 10:28 ` Richard Keech
2006-05-22 15:31 ` Ed Wilts
0 siblings, 2 replies; 15+ messages in thread
From: Nicholas C. Strugnell @ 2006-05-18 9:42 UTC (permalink / raw)
To: rgautier; +Cc: device-mapper development, consult-list
On Thu, 2006-05-18 at 10:04 +0200, Nicholas C. Strugnell wrote:
> On Thu, 2006-05-18 at 08:44 +0100, Bob Gautier wrote:
> > On Thu, 2006-05-18 at 02:25 -0500, Jonathan E Brassow wrote:
> > > The system bus isn't a limiting factor is it? 64-bit PCI-X will get
> > > 8.5 GB/s (plenty), but 32-bit PCI 33MHz got 133MB/s.
> > >
> > > Can your disks sustain that much bandwidth? 10 striped drives might get
> > > better than 200MB/s if done right, I suppose.
> > >
>
> It might make sense to test raw writes to a device with dd and see if
> that gets comparable performance figures - I'll just try that myself
> actually.
write throughput to EVA 8000 (8GB write cache), host DL380 with 2x2Gb/s
HBAs, 2GB RAM
testing 4GB files:
on filesystems: bonnie++ -d /mnt/tmp -s 4g -f -n 0 -u root
ext3: 129MB/s sd=0.43
ext2: 202MB/s sd=21.34
on raw: 216MB/s sd=3.93 (dd if=/dev/zero of=/dev/mpath/3600508b4001048ba0000b00001400000 bs=4k count=1048576)
NB I did not have exclusive access to the SAN or this particular storage
array - this is a big corp. SAN network under quite heavy load and disk
array under moderate load - not even sure if I had exclusive access to
the disks. All values averaged over 20 runs.
The very low deviation of write speed on ext3 vs. ext2 or raw is
interesting - not sure if it means anything.
In any case, we don't manage to get very close to the theoretical
throughput of the 2 HBAs, 512MB/s.
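
For anyone repeating this kind of test, the sketch below shows one way to reduce a set of repeated runs to a mean, a standard deviation and a fraction of a theoretical ceiling. The per-run numbers are made up for illustration and are not the measurements reported above. The sample standard deviation is used here, which is an assumption: the message does not say how its sd values were computed.

```python
import statistics


def summarise(runs_mb_s, ceiling_mb_s):
    """Return (mean, sample standard deviation, fraction of ceiling)."""
    mean = statistics.mean(runs_mb_s)
    sd = statistics.stdev(runs_mb_s)  # sample (n - 1) standard deviation
    return mean, sd, mean / ceiling_mb_s


# Hypothetical per-run throughput figures in MB/s, for illustration only.
example_runs = [210.0, 220.0, 215.0, 219.0]
mean, sd, fraction = summarise(example_runs, ceiling_mb_s=512.0)
print(f"mean={mean:.1f} MB/s sd={sd:.2f} fraction of ceiling={fraction:.0%}")
```

A low sd across many runs, with a mean well below the ceiling, points to a steady bottleneck rather than to noise from other load on the SAN.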
Nick
--
M: +44 (0)7736 665171 Skype: nstrug
http://europe.redhat.com
GPG FPR: 9C6C 093C 756A 6C57 49A1 E211 BBBA F5F5 C440 5DE0
* Re: [Consult-list] Re: dm-multipath has great throughput but we'd like more!
2006-05-18 9:42 ` Nicholas C. Strugnell
@ 2006-05-18 10:28 ` Richard Keech
2006-05-22 15:31 ` Ed Wilts
1 sibling, 0 replies; 15+ messages in thread
From: Richard Keech @ 2006-05-18 10:28 UTC (permalink / raw)
To: Nicholas C. Strugnell; +Cc: device-mapper development, consult-list
Nicholas C. Strugnell wrote:
>On Thu, 2006-05-18 at 10:04 +0200, Nicholas C. Strugnell wrote:
>>On Thu, 2006-05-18 at 08:44 +0100, Bob Gautier wrote:
>>>On Thu, 2006-05-18 at 02:25 -0500, Jonathan E Brassow wrote:
>>>>The system bus isn't a limiting factor is it? 64-bit PCI-X will get
>>>>8.5 GB/s (plenty), but 32-bit PCI 33MHz got 133MB/s.
>>>>
>>>>Can your disks sustain that much bandwidth? 10 striped drives might get
>>>>better than 200MB/s if done right, I suppose.
>>
>>It might make sense to test raw writes to a device with dd and see if
>>that gets comparable performance figures - I'll just try that myself
>>actually.
>
>write throughput to EVA 8000 (8GB write cache), host DL380 with 2x2Gb/s
>HBAs, 2GB RAM
>
>testing 4GB files:
>
>on filesystems: bonnie++ -d /mnt/tmp -s 4g -f -n 0 -u root
>
>ext3: 129MB/s sd=0.43
I presume this is with data=ordered. Try with data=writeback.
I've seen benchmarks which suggest it can be close to the speed of ext2.
--
Red Hat Home
www.redhat.com.au <http://www.redhat.com>
*Richard Keech*
Chief Technology Architect
Red Hat Asia-Pacific
email: rkeech@redhat.com <mailto:rkeech@redhat.com>
mobile: +61 419 036 463
Level 50, 120 Collins Street
Melbourne VIC 3000
phone: +61 3 9225 5258
fax: +61 3 9225 5050
support: 1800 733 428
* Re: [Consult-list] dm-multipath has great throughput but we'd like more!
2006-05-18 7:05 dm-multipath has great throughput but we'd like more! Bob Gautier
2006-05-18 7:19 ` [Consult-list] " Bob Gautier
2006-05-18 7:25 ` Jonathan E Brassow
@ 2006-05-18 17:00 ` Rod Nayfield
2 siblings, 0 replies; 15+ messages in thread
From: Rod Nayfield @ 2006-05-18 17:00 UTC (permalink / raw)
To: rgautier; +Cc: dm-devel, consult-list
On Thu, 2006-05-18 at 08:05 +0100, Bob Gautier wrote:
> but we're wondering why we are not seeing higher
> throughput per-HBA -- the QL2340 datasheet says it should manage
> 200Mbyte/s and all switches etc. run at 2GBps.
Have you tried different values of rr_min_io in multipath.conf? I think
the default is 1000. (It is the last value in the dmsetup table output
for the device.)
-rod
p.s. you need to purge and recreate the map; just running "multipath"
will apply some changes but not rr_min_io (bz 187534).
* Re: dm-multipath has great throughput but we'd like more!
2006-05-18 7:44 ` Bob Gautier
` (2 preceding siblings ...)
2006-05-18 8:04 ` [Consult-list] " Nicholas C. Strugnell
@ 2006-05-18 20:28 ` Steve Lord
3 siblings, 0 replies; 15+ messages in thread
From: Steve Lord @ 2006-05-18 20:28 UTC (permalink / raw)
To: rgautier, device-mapper development; +Cc: consult-list
Provided you have things cabled right and you have 2 HBA ports going either
into a switch, or into the controllers of the raid (raid probably has 4
ports), then the theoretical bandwidth is closer to 400 Mbytes/sec. Pretty sure
any reasonable Hitachi raid will sustain close to that. Using other software and
raid hardware I can generally sustain 375 Mbytes/sec from 2 qlogic hba ports in
a fairly old dell server box, and that is going through 3 switches in the
middle.
You need to have sustained I/O which is directed at both sides of the
raid though. Not sure about the HDS 9980, but I think that is an
active/active raid, which means each controller can access each lun
in parallel. You really need to be striping your I/O across the luns
and controllers though. You can pull tricks to measure the fabric
capacity vs the storage bandwidth by using the raid's cache. Ensure you
have caching enabled in the raid, and have a file which is laid out
across multiple luns. Read a file which is a large percentage of the
cache size using O_DIRECT (lmdd can be built with direct I/O support).
Then run the read again; if you did it right, you just eliminated the
spindles from the I/O.
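
The read-twice trick above can be sketched as follows. This is a minimal, Linux-only illustration and not a replacement for lmdd: the function name, the 1 MiB block size and the example path are assumptions. The buffer comes from an anonymous mmap because O_DIRECT generally needs page-aligned memory.

```python
import mmap
import os
import time


def timed_read(path, block_size=1024 * 1024, direct=True):
    """Read a file sequentially and return (bytes_read, MB/s)."""
    flags = os.O_RDONLY
    if direct:
        flags |= os.O_DIRECT  # Linux-specific: bypass the host page cache
    buf = mmap.mmap(-1, block_size)  # anonymous mapping, page-aligned
    fd = os.open(path, flags)
    try:
        total = 0
        start = time.perf_counter()
        while True:
            n = os.readv(fd, [buf])  # fill the aligned buffer
            if n == 0:               # end of file
                break
            total += n
        elapsed = max(time.perf_counter() - start, 1e-9)
    finally:
        os.close(fd)
        buf.close()
    return total, total / elapsed / 1e6


# Usage (hypothetical path): the first pass warms the array cache,
# the second pass should then be served without touching the spindles.
#   timed_read("/mnt/tmp/bigfile")
#   print(timed_read("/mnt/tmp/bigfile"))
```

Because the host page cache is bypassed, a second pass that is much faster than the first suggests the array cache is serving the data, and its rate is then an estimate of what the fabric and HBAs can carry.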
Not sure about the hitachi raid again, but a lun would generally
belong to a controller on the raid, and there are usually two
controllers. Make sure that when you build the volume you stripe
luns so that they alternate between controllers. Then you need to
make sure that your I/Os are large enough to hit multiple disks
at once. There are lots of tricks to tuning this type of setup.
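
The alternating layout described above can be sketched as a simple interleave. The device names are hypothetical, and which LUN belongs to which controller has to come from the array's own management tools. The resulting order is the order in which the devices would be handed to whatever builds the striped volume.

```python
from itertools import chain, zip_longest


def interleave_by_controller(luns_a, luns_b):
    """Order LUNs A, B, A, B, ... so neighbouring stripes use different controllers."""
    paired = zip_longest(luns_a, luns_b)
    return [lun for lun in chain.from_iterable(paired) if lun is not None]


# Hypothetical multipath device names, grouped by owning controller.
ctrl_a = ["mpath0", "mpath2", "mpath4"]
ctrl_b = ["mpath1", "mpath3", "mpath5"]
stripe_order = interleave_by_controller(ctrl_a, ctrl_b)
print(" ".join(stripe_order))
```

If one controller owns more LUNs than the other, the leftover LUNs simply end up at the tail of the list, so the alternation is only as good as the balance of LUNs between controllers.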
The problem with the load balancing in dm-multipath is that it is not
really load balancing: it is round robin, on a per-lun basis I think,
and it has no global picture of how much other load is currently going
to each HBA or controller port. The best you can do is drop the value
of rr_min_io in the /etc/multipath.conf file to a small value; try
something like 1 or 2.
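
To see why a small rr_min_io can matter, here is a deliberately simplified model of a round-robin selector that sends rr_min_io requests down one path before switching to the next. It is not the kernel's implementation: it ignores per-lun state and request sizes, and the path names and burst length are hypothetical.

```python
from collections import Counter


def round_robin_paths(n_ios, paths, rr_min_io):
    """Yield the path used for each I/O: rr_min_io requests per path, then switch."""
    for i in range(n_ios):
        yield paths[(i // rr_min_io) % len(paths)]


def busiest_share(n_ios, paths, rr_min_io):
    """Fraction of a burst of n_ios requests carried by the busiest path."""
    counts = Counter(round_robin_paths(n_ios, paths, rr_min_io))
    return max(counts.values()) / n_ios


paths = ["hba0", "hba1"]  # hypothetical path names
for rr in (1000, 100, 1):
    share = busiest_share(500, paths, rr)
    print(f"rr_min_io={rr:4d}: busiest path carries {share:.0%} of a 500-I/O burst")
```

In this model a burst shorter than rr_min_io lands entirely on one path, while rr_min_io=1 splits it evenly. That is the intuition behind trying values like 1 or 2.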
Steve
* Re: [Consult-list] Re: dm-multipath has great throughput but we'd like more!
2006-05-18 9:42 ` Nicholas C. Strugnell
2006-05-18 10:28 ` Richard Keech
@ 2006-05-22 15:31 ` Ed Wilts
1 sibling, 0 replies; 15+ messages in thread
From: Ed Wilts @ 2006-05-22 15:31 UTC (permalink / raw)
To: device-mapper development; +Cc: consult-list
On Thu, May 18, 2006 at 11:42:36AM +0200, Nicholas C. Strugnell wrote:
> write throughput to EVA 8000 (8GB write cache), host DL380 with 2x2Gb/s
> HBAs, 2GB RAM
>
> testing 4GB files:
>
> on filesystems: bonnie++ -d /mnt/tmp -s 4g -f -n 0 -u root
>
> ext3: 129MB/s sd=0.43
>
> ext2: 202MB/s sd=21.34
>
> on raw: 216MB/s sd=3.93 (dd if=/dev/zero of=/dev/mpath/3600508b4001048ba0000b00001400000 bs=4k count=1048576)
>
>
> NB I did not have exclusive access to the SAN or this particular storage
> array - this is a big corp. SAN network under quite heavy load and disk
> array under moderate load - not even sure if I had exclusive access to
> the disks. All values averaged over 20 runs.
Since I manage a half-dozen EVAs, I'll pretend I actually know something
about them :-). First, there are multiple ways of setting up the LUNs
on the frame - anywhere from a small LUN with RAID5 to a large LUN with
raid 0. The differences should be significant. A small RAID5 LUN will
give you very limited balancing across physical disks. Because of the
virtualization of the disks within the frame, you most definitely do not
have exclusive access to the physical disks. It's quite possible that
your raid 5 partition is on the same physical disk as a very busy
database. The EVA spreads the lun across multiple spindles - the larger
the lun, the more spindles you can get working for you.
If you can, get the storage group to assign you a large raid 0 lun and
redo your tests. You should see different results.
.../Ed
--
Ed Wilts, RHCE
Mounds View, MN, USA
mailto:ewilts@ewilts.org
Member #1, Red Hat Community Ambassador Program