* No idea about shaping trough many pc
@ 2008-01-10 9:06 Badalian Vyacheslav
2008-01-10 11:00 ` Denys Fedoryshchenko
2008-01-10 15:38 ` Lennart Sorensen
0 siblings, 2 replies; 4+ messages in thread
From: Badalian Vyacheslav @ 2008-01-10 9:06 UTC
To: netdev
Hello all.
I have been trying for more than 2 months to solve a problem with my
shaping. Maybe you can help me?
Scheme:

                  +----------------+
            +-----| Shaping PC 1   |-----+
           /      +----------------+      \
+-------+ /       +----------------+       \ +-------+
| Cisco |+--------| Shaping PC N   |--------+| Cisco |
+-------+ \       +----------------+       / +-------+
           \      +----------------+      /
            +-----| Shaping PC 20  |-----+
                  +----------------+

Network: over 10k users. Total bandwidth to the Internet is more than
1 Gbit/s. All computers run BGP with multipath turned on.
The Cisco can't do per-packet load sharing (that would solve all my
problems =((( ). It can only balance by DST IP, SRC IP, or those plus
Layer 4.
OK. Each user must get a speed of 1 Mbit/s.
Let's look at the variants:
1. Create rules per user = (1 Mbit/s / N computers). If the user uses N
connections all is great, but if he uses 1 connection his speed is
1 Mbit/s / N, which does not look good. All would be great if the Cisco
could do PER PACKET load sharing =(
2. Create rules per user = 1 Mbit/s. If the user uses 1 connection all is
great, but if he uses N connections his speed is much more than the
intended limit =(
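The trade-off between the two variants can be put into numbers with a small sketch. It assumes the Cisco hashes each connection to one shaper and that a user's connections land on distinct shapers whenever possible (the best case for the user). The function name and the use of kbit/s are illustrative, not part of any real tool.

```python
def user_rate_kbit(limit_kbit, n_shapers, n_flows, split_limit):
    """Best-case aggregate rate one user can reach, in kbit/s.

    split_limit=True  -> variant 1: every shaper enforces limit / N
    split_limit=False -> variant 2: every shaper enforces the full limit
    Assumption: each flow is hashed to one shaper, and the user's flows
    are spread over as many different shapers as possible.
    """
    per_shaper = limit_kbit // n_shapers if split_limit else limit_kbit
    shapers_used = min(n_flows, n_shapers)
    return per_shaper * shapers_used


if __name__ == "__main__":
    limit, shapers = 1000, 20  # 1 Mbit/s per user, 20 shaping PCs
    for flows in (1, shapers):
        v1 = user_rate_kbit(limit, shapers, flows, split_limit=True)
        v2 = user_rate_kbit(limit, shapers, flows, split_limit=False)
        print(f"{flows:2d} flow(s): variant 1 = {v1} kbit/s, variant 2 = {v2} kbit/s")
```

With 20 shapers, variant 1 leaves a single-connection user at 50 kbit/s, while variant 2 lets a 20-connection user reach 20 Mbit/s.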
Why do I use 20 PCs? Because one PC normally forwards only 100-150 Mbit/s,
at which point it has 100% CPU usage in software interrupts...
Any idea how to solve this problem?
In my dreams (feature request to netdev ;) ):
Take one PC and call it MASTER TC. All 20 PCs synchronize statistics with
the MASTER and have common rules and statistics. Then I use variant 2 and
will be happy... but is that not realistic? =(
Maybe there are other variants?
Thanks for the help!
Slavon.
P.S. Sorry for my English =(
* Re: No idea about shaping trough many pc
From: Denys Fedoryshchenko @ 2008-01-10 11:00 UTC
To: Badalian Vyacheslav, netdev
For proper link bandwidth sharing, I guess something like the network
counters would have to be shared between the PCs (with proper locking). I
haven't heard of anything like this.
IMHO, one way to do this:
Split the destination network into multiple parts and set up routes on the
Cisco. Let's say you have:
192.168.0.0/16
and you have 4 balancing PCs,
total bandwidth 1 Gbit/s (speed conforming to IEC, 1000 Mbit/s in 1 Gbit/s).
Then on the Cisco you do:
192.168.0.0/18 via PC1 (shared speed 250 Mbit/s)
192.168.64.0/18 via PC2 (shared speed 250 Mbit/s)
192.168.128.0/18 via PC3 (shared speed 250 Mbit/s)
192.168.192.0/18 via PC4 (shared speed 250 Mbit/s)
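The split above can be generated rather than typed by hand. This is a minimal sketch using Python's standard `ipaddress` module. The function name `split_network` and the PC labels are made up for illustration, and it assumes the number of shaping PCs is a power of two so the network divides evenly.

```python
import ipaddress


def split_network(cidr, n_shapers):
    """Split `cidr` into n_shapers equal subnets, one per shaping PC.

    Assumption: n_shapers is a power of two, so the split is even.
    """
    if n_shapers < 1 or n_shapers & (n_shapers - 1):
        raise ValueError("n_shapers must be a power of two")
    extra_bits = n_shapers.bit_length() - 1  # 4 PCs -> 2 extra prefix bits
    net = ipaddress.ip_network(cidr)
    return list(net.subnets(prefixlen_diff=extra_bits))


if __name__ == "__main__":
    total_mbit = 1000
    subnets = split_network("192.168.0.0/16", 4)
    share = total_mbit // len(subnets)
    for i, subnet in enumerate(subnets, start=1):
        print(f"{subnet} via PC{i} (shared speed {share} Mbit/s)")
```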
You could probably write some scripts that check whether some PC has too
much unused bandwidth (averaged over 5 minutes), and then give more
bandwidth to another PC that needs it. For example:
The 5-minute average counters show:
PC1 - occupies 100 Mbit/s
PC2 - -//-      50 Mbit/s
PC3 - -//-     150 Mbit/s
PC4 - -//-     230 Mbit/s
Then you change the link speeds (Mbit/s):
PC1 max 200
PC2 max 150
PC3 max 250
PC4 max 400 (100 from PC2 and 50 from PC1)
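One possible rule that reproduces the numbers in this example is sketched below: every PC except the busiest gets its measured usage plus a fixed headroom, and the busiest PC gets whatever is left of the total. The 100 Mbit/s headroom is an assumption inferred from the example figures, not something stated in the text, and `rebalance` is a hypothetical helper name.

```python
def rebalance(usage_mbit, total_mbit, headroom_mbit=100):
    """Return new per-PC limits in Mbit/s.

    usage_mbit: dict of PC name -> 5-minute average usage.
    Every PC except the busiest gets usage + headroom; the busiest PC
    receives the remainder of total_mbit.
    The headroom value is an assumption made for this sketch.
    """
    busiest = max(usage_mbit, key=usage_mbit.get)
    limits = {
        pc: used + headroom_mbit
        for pc, used in usage_mbit.items()
        if pc != busiest
    }
    remainder = total_mbit - sum(limits.values())
    if remainder < usage_mbit[busiest]:
        raise ValueError("total bandwidth cannot cover the current usage")
    limits[busiest] = remainder
    return limits


if __name__ == "__main__":
    usage = {"PC1": 100, "PC2": 50, "PC3": 150, "PC4": 230}
    for pc, limit in sorted(rebalance(usage, total_mbit=1000).items()):
        print(f"{pc} max {limit}")
```

A real script would also have to read the counters and apply the new limits on each PC, which is left out here.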
Of course, each PC must be capable of passing this traffic. And IMHO it is
not normal that your PCs cannot handle more than 200 Mbps of traffic. I have
a complicated setup with 4 LAN 8139 cards, which passes 200 Mbps of traffic
in total. I am sure it can handle up to 300 Mbps, but I am already replacing
it with a PC with PCI-E e1000/Broadcom NetXtreme cards with offloading
capabilities, large buffers and proper drivers with NAPI. I have such
hardware handling, for example, 160 Mbps right now, and the counters are:
12:50:41  CPU  %user  %nice   %sys  %iowait   %irq  %soft  %steal   %idle   intr/s
12:50:42  all   0.00   0.00   0.00     0.00   0.25   1.24    0.00   98.51  4009.90
12:50:43  all   0.00   0.00   0.00     0.00   0.00   1.25    0.00   98.75  4024.75
12:50:44  all   0.00   0.00   0.00     0.00   0.00   1.50    0.00   98.50  4181.82
12:50:45  all   0.25   0.00   0.00     0.00   0.00   1.50    0.00   98.25  4626.73
12:50:46  all   0.00   0.00   0.00     0.00   0.00   1.50    0.00   98.50  4351.52
12:50:47  all   0.25   0.00   0.00     0.00   0.00   1.75    0.00   98.00  4805.88
This is 2.6.23.8 with some mistakes made during configuration; I am going to
try 2.6.24-rc7 and some optimizations.
Right now the profile looks like:
10957  17.0675  mwait_idle_with_hints
 7454  11.6110  read_hpet
 3883   6.0485  _raw_spin_lock
 1605   2.5001  timer_interrupt
 1363   2.1231  irq_entries_start
So maybe I will have to try changing the timer source to TSC, disabling
nmi_watchdog and tuning the network driver (bnx2).
You should probably check such things too.
--
Denys Fedoryshchenko
Technical Manager
Virtual ISP S.A.L.
* Re: No idea about shaping trough many pc
From: Badalian Vyacheslav @ 2008-01-10 11:23 UTC
To: Denys Fedoryshchenko; +Cc: netdev
Thanks for the answer!
One PC has more than 15k TC rules... tc takes a lot of CPU softirq (si)
time...
I don't want to split the networks on the Cisco, because I need HA. If one
PC shuts down, another must take over all of its functions. BGP works great
for this.
But I have trouble controlling the total traffic from users.
It would be great if I could create common TC rules that are shared across
all PCs and have common data (statistics, info, tokens, etc.). Maybe netdev
could think about a system mode where one OS runs on many PCs (like an HA
cluster). Linux could be a great routing solution!
It would be great if all 20 PCs could act as 1 logical PC that has many CPUs
and can process lots of interrupts!
Thanks =)
* Re: No idea about shaping trough many pc
From: Lennart Sorensen @ 2008-01-10 15:38 UTC
To: Badalian Vyacheslav; +Cc: netdev
On Thu, Jan 10, 2008 at 12:06:35PM +0300, Badalian Vyacheslav wrote:
> Why i use 20 PC? Becouse 1 pc normal forward 100-150mbs... when it have
> 100% cpu usage on Sofware Interrupts...
I have managed forwarding of 600 Mbps using about 15% CPU load on a
500 MHz Geode LX, using 4 100 Mbit pcnet32 interfaces and a small tweak to
how NAPI is implemented on it. Adding traffic shaping and such to
the processing would certainly increase the CPU load, but hopefully not
by much. The reason I didn't get more than 600 Mbps was that the PCI bus
was full at that point.
> Any idea how to resolve this problem?
>
> In my dreams (feature request to netdev ;) ):
> Get PC - title: MASTER TC. All 20 PC syncronize statistic with MASTER
> and have common rules and statistic. Then i use variant 2 and will be
> happy... but its not real? =(
> Maybe have other variants?
Well, not sure about synchronizing and all that. I still think that if I can
manage a 600 Mbps forwarding rate using a slowpoke Geode, then a modern CPU
like a Q6600 with a number of PCIe gigabit ports should be able to do quite
a lot.
The tweak I did was to add a timer to the driver that I can activate
whenever I finish emptying the receive queue. When the timer expires it
adds the port back to the NAPI queue, and when poll is called again it
will either process whatever packets arrived during the delay, or
it will actually unmask the IRQ and go back to IRQ mode. The delay I
use is 1 jiffy, and I run with HZ=1000 and set the queues to 256 packets,
since 1 ms at 100 Mbit/s can provide at most about 200 packets (64-byte
worst case). Whenever I empty the queue I simply check how many packets I
just processed. If it was greater than 0, I enable the timer to expire on
the next jiffy, remove the port from NAPI polling and leave its IRQ masked.
If it was 0, then I must have been called again after the timer expired and
still had no packets to process, in which case I unmask the IRQ and don't
enable the timer. I had to change HZ to 1000 since at 250 or 100 I wouldn't
be able to handle the worst-case number of packets (the pcnet32 has a
maximum of 512 packets in a queue).
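The queue-size reasoning above can be checked with a little arithmetic. The sketch below computes how many minimum-size frames can arrive on a link during one jiffy. Counting 20 bytes of per-frame wire overhead (preamble, start-of-frame delimiter and inter-frame gap) is an assumption added here; setting it to 0 gives the rougher "about 200" figure. The function name is illustrative.

```python
def max_packets_per_jiffy(link_bps, hz, frame_bytes=64, overhead_bytes=20):
    """Upper bound on frames that can arrive on one port during one jiffy.

    overhead_bytes models preamble + SFD + inter-frame gap on Ethernet;
    it is an assumption of this sketch and can be set to 0.
    """
    bits_per_frame = (frame_bytes + overhead_bytes) * 8
    return link_bps // (bits_per_frame * hz)


if __name__ == "__main__":
    queue_limit = 512  # pcnet32 maximum queue size quoted in the text
    for hz in (1000, 250, 100):
        worst = max_packets_per_jiffy(100_000_000, hz)
        verdict = "fits" if worst <= queue_limit else "overflows"
        print(f"HZ={hz}: up to {worst} packets per jiffy, {verdict} a {queue_limit}-packet queue")
```

Under these assumptions only HZ=1000 keeps the worst case below both the 256-packet setting and the 512-packet hardware maximum.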
With NAPI the normal behaviour is that whenever you empty the receive
queue, you re-enable IRQs. But it doesn't take that fast a CPU to
actually empty the queue all the time, and then you end up with the
overhead of masking IRQs every time you receive packets, processing them,
and then unmasking the IRQ, only to get an IRQ for the next packet within a
fraction of a millisecond. With the delay until the next jiffy for
unmasking the IRQ you end up causing a potential lag on processing packets
of up to 1 ms, although on average less than that, but the IRQ load drops
dramatically and the overhead of managing the IRQ masking and the IRQ
handler goes away. In the case of this system the CPU load dropped from 90%
at 500 Mbps to 15% at 600 Mbps, and the interrupt rate dropped from one IRQ
every couple of packets to one IRQ at the start of each burst of packets.
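The effect on the interrupt rate can be illustrated with a toy simulation. This is not driver code, only a model of the state machine described above at one-jiffy granularity. The `packets_per_irq=2` figure stands in for "one IRQ every couple of packets" and is an assumption, as are the function names and the sample traffic.

```python
def irqs_plain_napi(arrivals, packets_per_irq=2):
    """IRQs when the queue is drained and the IRQ unmasked right away.

    arrivals: packets received in each jiffy.
    packets_per_irq is an assumed average taken from the description.
    """
    return sum(-(-packets // packets_per_irq) for packets in arrivals)


def irqs_deferred_unmask(arrivals):
    """IRQs when unmasking is deferred by a one-jiffy timer.

    The first packet of a burst raises an IRQ. After that the port is
    polled once per jiffy with the IRQ masked, and the IRQ is unmasked
    only after a poll that found no packets.
    """
    irqs = 0
    irq_unmasked = True
    for packets in arrivals:
        if irq_unmasked:
            if packets > 0:
                irqs += 1             # burst starts: one IRQ, then mask
                irq_unmasked = False  # timer armed for the next jiffy
        elif packets == 0:
            irq_unmasked = True       # empty poll: back to IRQ mode
        # otherwise: packets processed, timer re-armed, IRQ stays masked
    return irqs


if __name__ == "__main__":
    # Two bursts separated by idle jiffies (made-up traffic pattern).
    traffic = [0, 150, 150, 150, 0, 0, 150, 0]
    print("plain NAPI IRQs:     ", irqs_plain_napi(traffic))
    print("deferred unmask IRQs:", irqs_deferred_unmask(traffic))
```

In this model the deferred variant takes one IRQ per burst, at the cost of up to one jiffy of extra latency for packets that arrive while the IRQ is masked.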
I believe some gigabit Ethernet ports and most 10 Gig ports have the ability
to do delayed IRQs, where they wait for a certain number of packets before
generating an IRQ, which is pretty much what I tried to emulate with my
tweak, and it sure works amazingly well.
--
Len Sorensen