* [uml-devel] tipc in networked UML
From: Randy Macleod @ 2004-08-18 18:35 UTC
Cc: user-mode-linux-devel, tipc-discussion
Hello,
I'm simulating a small cluster of processors using
UML (user-mode-linux.sourceforge.net)
and TIPC (tipc.sf.net).
Has anyone else tried this?
The network is simulated using tun devices, i.e.
cpu0 with eth0 in a UML is connected to pce.0.0.0
cpu1 with eth0 in a UML is connected to pce.0.1.0
A bridge device connects these pce devices and forwards
Ethernet frames based on MAC addresses. According to
Jeff Dike (the UML guy), packet delivery is communicated
to the receiving UML by a signal.
Below is some output of /sbin/ifconfig -a on the desktop...
Now when these UMLs load, a tipc kernel module gets insmod'ed
and, to first order, things work as expected.
BUT...
Several problems occur:
1. Connectionless communication is very unreliable
(see my previous post for tfsend/tfrecv).
Only hundreds of messages can be exchanged before getting
a sequence error.
2. Resetting one node confuses the tipc name table
on the other nodes.
If I have 3 UMLs (A, B, C),
- publish a tipcname on A,
- reset node C,
- then node A gets stuck periodically withdrawing the published name.
3. On a lightly loaded system with tens of processes per UML,
there are frequently very long (> 5 seconds) tipc packet latencies,
whereas the normal latency is 0.3 seconds. The desktop load
is always very low. If I ping each node every 0.1 seconds,
the high latencies mostly go away. Seems like the signal
is missed...
So, does TIPC assume real-time behaviour
of packets on the network, i.e. that by the time N packets have been
sent, the other kernels have received the packets and will send
flow control messages?
If this is the case then I think a minimal modification to UML
to do a sched_yield() every N packets (in or out) would
help matters. Furthermore, it may be a good idea to co-operatively
schedule all the UMLs so that their clocks are all in sync and
no node can send or receive too many packets at one time.
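The "yield every N packets" idea can be sketched roughly as follows. This is user-space Python for illustration only, not a UML patch: the class name `PacketCounter`, the hook `on_packet`, and the value N = 8 are all hypothetical, and in UML itself the change would have to live in the C network driver, which is not shown in this thread.

```python
import os


class PacketCounter:
    """Counts packets (in or out) and yields the CPU every `yield_every` packets."""

    def __init__(self, yield_every, yield_fn=None):
        self.yield_every = yield_every
        # os.sched_yield() exists on Linux; fall back to a no-op elsewhere.
        self.yield_fn = yield_fn or getattr(os, "sched_yield", lambda: None)
        self.count = 0
        self.yields = 0

    def on_packet(self):
        """Call once per packet sent or received."""
        self.count += 1
        if self.count % self.yield_every == 0:
            # Give the other UML processes a chance to run and
            # drain their receive queues before we send more.
            self.yield_fn()
            self.yields += 1


counter = PacketCounter(yield_every=8)  # N = 8 is an arbitrary example value
for _ in range(20):
    counter.on_packet()
```

Whether yielding actually lets the peer UML run in time depends on the host scheduler, so this only illustrates the mechanism, not its effect on the latencies described above.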
Comments?
// Randy
cpbr.0 Link encap:Ethernet HWaddr 00:FF:3A:1A:CB:69
inet addr:10.0.254.1 Bcast:10.255.255.255 Mask:255.255.0.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:52 errors:0 dropped:0 overruns:0 frame:0
TX packets:10 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:1624 (1.5 Kb) TX bytes:704 (704.0 b)
pce.0.0.0 Link encap:Ethernet HWaddr 00:FF:E2:5D:50:B0
UP BROADCAST RUNNING PROMISC MULTICAST MTU:1500 Metric:1
RX packets:3262 errors:0 dropped:0 overruns:0 frame:0
TX packets:2562 errors:0 dropped:149 overruns:0 carrier:0
collisions:0 txqueuelen:1024
RX bytes:452010 (441.4 Kb) TX bytes:519063 (506.8 Kb)
pce.0.1.0 Link encap:Ethernet HWaddr 00:FF:3A:1A:CB:69
UP BROADCAST RUNNING PROMISC MULTICAST MTU:1500 Metric:1
RX packets:6 errors:0 dropped:0 overruns:0 frame:0
TX packets:1333 errors:0 dropped:1079 overruns:0 carrier:0
collisions:0 txqueuelen:1024
RX bytes:228 (228.0 b) TX bytes:302530 (295.4 Kb)
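As a quick reading of the counters above, the TX drop fraction per interface can be worked out as below. The numbers are copied from the ifconfig output; the formula assumes that "TX packets" counts only packets actually sent, so that attempts = sent + dropped. That assumption is not confirmed anywhere in this thread and may vary by driver.

```python
# (interface, TX packets, TX dropped) copied from the ifconfig output above
tx_stats = [
    ("cpbr.0", 10, 0),
    ("pce.0.0.0", 2562, 149),
    ("pce.0.1.0", 1333, 1079),
]


def drop_fraction(sent, dropped):
    """Fraction of attempted transmissions that were dropped.

    Assumes attempts = sent + dropped (see the note above the code).
    """
    attempts = sent + dropped
    return dropped / attempts if attempts else 0.0


fractions = {name: drop_fraction(sent, dropped) for name, sent, dropped in tx_stats}
for name, frac in fractions.items():
    print(f"{name}: {frac:.1%} of TX attempts dropped")
```

Under that assumption, pce.0.1.0 drops roughly 45% of what is sent towards it, against about 5% on pce.0.0.0.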
-------------------------------------------------------
SF.Net email is sponsored by Shop4tech.com-Lowest price on Blank Media
100pk Sonic DVD-R 4x for only $29 -100pk Sonic DVD+R for only $33
Save 50% off Retail on Ink & Toner - Free Shipping and Free Gift.
http://www.shop4tech.com/z/Inkjet_Cartridges/9_108_r285
_______________________________________________
User-mode-linux-devel mailing list
User-mode-linux-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/user-mode-linux-devel
* [uml-devel] Re: [tipc-discussion] tipc in networked UML
From: Jon Maloy @ 2004-08-18 19:24 UTC
To: Randy Macleod; +Cc: user-mode-linux-devel, tipc-discussion
See below
/jon
Randy Macleod wrote:
>
> Hello,
>
> I'm simulating a small cluster of processors using
> UML (user-mode-linux.sourceforge.net)
> and TIPC (tipc.sf.net).
> Has anyone else tried this?
>
> The network is simulated using tun devices, i.e.
>
> cpu0 with eth0 in a UML is connected to pce.0.0.0
> cpu1 with eth0 in a UML is connected to pce.0.1.0
>
> A bridge device connects these pce devices and forwards
> Ethernet frames based on MAC addresses. According to
> Jeff Dike (the UML guy), packet delivery is communicated
> to the receiving UML by a signal.
>
> Below is some output of /sbin/ifconfig -a on the desktop...
>
> Now when these UMLs load, a tipc kernel module gets insmod'ed
> and, to first order, things work as expected.
> BUT...
> Several problems occur:
>
> 1. Connectionless communication is very unreliable
> (see my previous post for tfsend/tfrecv).
> Only hundreds of messages can be exchanged before getting
> a sequence error.
At least this should make it easier to reproduce and track down the
problem you identified in your previous mail...
>
>
> 2. Resetting one node confuses the tipc name table
> on the other nodes.
>
> If I have 3 UMLs (A, B, C),
> - publish a tipcname on A,
> - reset node C,
> - then node A gets stuck periodically withdrawing the published name.
Do you mean node A hangs? (Btw, are you running the latest version,
tipc-1.3.14? I fixed a quite nasty bug in one of the later versions, where
equal publications from different nodes got the same publication key, with
the result that the wrong publication was sometimes removed.)
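A toy model of that kind of key collision, as a sketch only: this is not TIPC's name table code, and the tuple layout, the `withdraw` helper, and the `match_node` switch are all invented for illustration. It assumes two nodes publish the same name and happen to pick the same publication key.

```python
def withdraw(table, node, pub_key, match_node):
    """Remove the first publication matching pub_key (and node, if match_node)."""
    for i, (entry_node, entry_key, _name) in enumerate(table):
        if entry_key == pub_key and (not match_node or entry_node == node):
            del table[i]
            return


def demo(match_node):
    # Nodes A and B publish the same name and happen to use the same key.
    table = [("A", 1, "svc"), ("B", 1, "svc")]
    withdraw(table, "B", 1, match_node)  # B withdraws its publication
    # Return the nodes the table still believes are publishing "svc".
    return [entry_node for entry_node, _key, _name in table]


buggy = demo(match_node=False)  # key alone identifies a publication
fixed = demo(match_node=True)   # (node, key) identifies a publication
```

With the key alone, B's withdrawal removes A's entry and the table still lists B as a publisher; matching on node as well leaves only A, which is the expected result.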
>
>
>
> 3. On a lightly loaded system with tens of processes per UML,
> there are frequently very long (> 5 seconds) tipc packet latencies,
> whereas the normal latency is 0.3 seconds. The desktop load
> is always very low. If I ping each node every 0.1 seconds,
> the high latencies mostly go away. Seems like the signal
> is missed...
>
>
> So, does TIPC assume real-time behaviour
> of packets on the network, i.e. that by the time N packets have been
> sent, the other kernels have received the packets and will send
> flow control messages?
To work really well we assume that there is real parallelism, but TIPC
should never fail because of long latency times.
I think I have seen a similar effect on VMware.
>
>
> If this is the case then I think a minimal modification to UML
> to do a sched_yield() every N packets (in or out) would
> help matters. Furthermore, it may be a good idea to co-operatively
> schedule all the UMLs so that their clocks are all in sync and
> no node can send or receive too many packets at one time.
>
> Comments?
>
> // Randy
>
>
>
>
> cpbr.0 Link encap:Ethernet HWaddr 00:FF:3A:1A:CB:69
> inet addr:10.0.254.1 Bcast:10.255.255.255 Mask:255.255.0.0
> UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
> RX packets:52 errors:0 dropped:0 overruns:0 frame:0
> TX packets:10 errors:0 dropped:0 overruns:0 carrier:0
> collisions:0 txqueuelen:0
> RX bytes:1624 (1.5 Kb) TX bytes:704 (704.0 b)
>
> pce.0.0.0 Link encap:Ethernet HWaddr 00:FF:E2:5D:50:B0
> UP BROADCAST RUNNING PROMISC MULTICAST MTU:1500 Metric:1
> RX packets:3262 errors:0 dropped:0 overruns:0 frame:0
> TX packets:2562 errors:0 dropped:149 overruns:0 carrier:0
> collisions:0 txqueuelen:1024
> RX bytes:452010 (441.4 Kb) TX bytes:519063 (506.8 Kb)
>
> pce.0.1.0 Link encap:Ethernet HWaddr 00:FF:3A:1A:CB:69
> UP BROADCAST RUNNING PROMISC MULTICAST MTU:1500 Metric:1
> RX packets:6 errors:0 dropped:0 overruns:0 frame:0
> TX packets:1333 errors:0 dropped:1079 overruns:0 carrier:0
> collisions:0 txqueuelen:1024
> RX bytes:228 (228.0 b) TX bytes:302530 (295.4 Kb)
* [uml-devel] Re: [tipc-discussion] tipc in networked UML
From: Randy Macleod @ 2004-08-19 19:27 UTC
To: user-mode-linux-devel, tipc-discussion
Hi,
Comments and more info below.
// Randy
Jon Maloy wrote:
> See below
> /jon
>
> Randy Macleod wrote:
>
> >
> > Hello,
> >
> > I'm simulating a small cluster of processors using
> > UML (user-mode-linux.sourceforge.net)
> > and TIPC (tipc.sf.net).
> > Has anyone else tried this?
> >
> > The network is simulated using tun devices, i.e.
> >
> > cpu0 with eth0 in a UML is connected to pce.0.0.0
> > cpu1 with eth0 in a UML is connected to pce.0.1.0
> >
> > A bridge device connects these pce devices and forwards
> > Ethernet frames based on MAC addresses. According to
> > Jeff Dike (the UML guy), packet delivery is communicated
> > to the receiving UML by a signal.
> >
> > Below is some output of /sbin/ifconfig -a on the desktop...
> >
> > Now when these UMLs load, a tipc kernel module gets insmod'ed
> > and, to first order, things work as expected.
> > BUT...
> > Several problems occur:
> >
> > 1. Connectionless communication is very unreliable
> > (see my previous post for tfsend/tfrecv).
> > Only hundreds of messages can be exchanged before getting
> > a sequence error.
>
> At least this should make it easier to reproduce and track down the
> problem you identified in your previous mail...
Yes. I've turned on the tipc logging and I'm coming up to
speed on tipc (and kernel) internals.
> >
> > 2. Resetting one node confuses the tipc name table
> > on the other nodes.
> >
> > If I have 3 UMLs (A, B, C),
> > - publish a tipcname on A,
> > - reset node C,
> > - then node A gets stuck periodically withdrawing the published name.
I should have said:
node A continually sends out periodic withdraw messages.
I repeated this test and saw different behaviour...
After node C was reset, things look pretty normal - the link to C gets
torn down - but name publication seems to be broken: the name
distribution seems to work at the packet level, but node B always
reports that the name of interest has been published even if I kill
the process that owns the name. I waited for quite a while
for the timeout... This sounds like it could be the fixed bug
that you mention below. I'll see if that helps.
BTW, about a month ago, I extended the link tolerance
from 1500 (ms ?) to 15000. This helped avoid false
link down messages at a time when I didn't care about detecting
node failures. Now I do care about node failure and
I've added tolerance and maxinterval as insmod options.
I'm testing with tolerance=1600, maxinterval=400.
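A back-of-the-envelope check of those settings, under stated assumptions: both values are taken to be in milliseconds (the thread itself marks the unit with a question mark), `tolerance` is taken to be the silence after which a link is declared down, and `maxinterval` the longest gap between link probes. The helper names are hypothetical and this is not how the tipc module computes anything.

```python
def probes_within_tolerance(tolerance_ms, max_interval_ms):
    """How many probe intervals fit inside the tolerance window."""
    return tolerance_ms // max_interval_ms


def link_declared_down(silence_ms, tolerance_ms):
    """True if a silent period exceeds the tolerance (assumed semantics)."""
    return silence_ms > tolerance_ms


missed = probes_within_tolerance(1600, 400)

# The "> 5 seconds" latency reported in the first mail, taken as a 5000 ms stall.
stall_ms = 5000
down_at_1600 = link_declared_down(stall_ms, 1600)
down_at_15000 = link_declared_down(stall_ms, 15000)
```

If those assumptions hold, four probe intervals fit in the 1600 window, and a 5 second stall would trip a link-down at tolerance=1600 but not at 15000, which matches the false link down messages seen before the tolerance was raised.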
>
> Do you mean node A hangs? (Btw, are you running the latest version,
> tipc-1.3.14?
No, I'm stuck in linux-2.4 land running tipc-1.2.05.
I'm going to either try umlinux-2.6 and the latest tipc or
backport tipc to (um)linux 2.4. I'll keep posting.
> I fixed a quite nasty bug in one of the later versions, where equal
> publications from different nodes got the same publication key, with
> the result that the wrong publication was sometimes removed.)
>
> >
> >
> >
> > 3. On a lightly loaded system with tens of processes per UML,
> > there are frequently very long (> 5 seconds) tipc packet latencies,
> > whereas the normal latency is 0.3 seconds. The desktop load
> > is always very low. If I ping each node every 0.1 seconds,
> > the high latencies mostly go away. Seems like the signal
> > is missed...
>
> >
> >
> > So, does TIPC assume real-time behaviour
> > of packets on the network, i.e. that by the time N packets have been
> > sent, the other kernels have received the packets and will send
> > flow control messages?
>
> To work really well we assume that there is real parallelism, but TIPC
> should never fail because of long latency times.
> I think I have seen a similar effect on VMware.
And did you ever get a tipc network to behave properly
using vmware?
>
> >
> >
> > If this is the case then I think a minimal modification to UML
> > to do a sched_yield() every N packets (in or out) would
> > help matters. Furthermore, it may be a good idea to co-operatively
> > schedule all the UMLs so that their clocks are all in sync and
> > no node can send or receive too many packets at one time.
> >
> > Comments?
> >
> > // Randy
> >
> >
> >
> >
> > cpbr.0 Link encap:Ethernet HWaddr 00:FF:3A:1A:CB:69
> > inet addr:10.0.254.1 Bcast:10.255.255.255 Mask:255.255.0.0
> > UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
> > RX packets:52 errors:0 dropped:0 overruns:0 frame:0
> > TX packets:10 errors:0 dropped:0 overruns:0 carrier:0
> > collisions:0 txqueuelen:0
> > RX bytes:1624 (1.5 Kb) TX bytes:704 (704.0 b)
> >
> > pce.0.0.0 Link encap:Ethernet HWaddr 00:FF:E2:5D:50:B0
> > UP BROADCAST RUNNING PROMISC MULTICAST MTU:1500 Metric:1
> > RX packets:3262 errors:0 dropped:0 overruns:0 frame:0
> > TX packets:2562 errors:0 dropped:149 overruns:0 carrier:0
> > collisions:0 txqueuelen:1024
> > RX bytes:452010 (441.4 Kb) TX bytes:519063 (506.8 Kb)
> >
> > pce.0.1.0 Link encap:Ethernet HWaddr 00:FF:3A:1A:CB:69
> > UP BROADCAST RUNNING PROMISC MULTICAST MTU:1500 Metric:1
> > RX packets:6 errors:0 dropped:0 overruns:0 frame:0
> > TX packets:1333 errors:0 dropped:1079 overruns:0 carrier:0
> > collisions:0 txqueuelen:1024
> > RX bytes:228 (228.0 b) TX bytes:302530 (295.4 Kb)
--
// Randy MacLeod