From: "Randy Macleod" <macleodr@nortelnetworks.com>
To: user-mode-linux-devel@lists.sourceforge.net,
	tipc-discussion@lists.sourceforge.net
Subject: [uml-devel] Re: [tipc-discussion] tipc in networked UML
Date: Thu, 19 Aug 2004 15:27:25 -0400
Message-ID: <4124FF1D.2040503@americasm01.nt.com>
In-Reply-To: <4123ACEA.6080600@ericsson.com>

   Hi,

   Comments and more info below.

// Randy

Jon Maloy wrote:
> See below
> /jon
> 
> Randy Macleod wrote:
> 
>  >
>  > Hello,
>  >
>  >    I'm simulating a small cluster of processors using
>  > UML (user-mode-linux.sourceforge.net)
>  > and TIPC (tipc.sf.net).
>  > Has anyone else tried this?
>  >
>  > The network is simulated using tun devices, i.e.
>  >
>  > cpu0 with eth0 in a UML is connected to pce.0.0.0
>  > cpu1 with eth0 in a UML is connected to pce.0.1.0
>  >
>  > A bridge device connects these pce(s) and forwards
>  > ethernet frames based on mac addrs. Packet delivery
>  > is communicated to the UML receiver by a signal according
>  > to Jeff Dike (the UML guy)
>  >
>  > Below is some output of /sbin/ifconfig -a on the desktop...
>  >
>  > Now when these UML's load, a tipc kernel module gets insmod'ed
>  > and things work to first order as expected.
>  > BUT...
>  > Several problems occur:
>  >
>  > 1. Connectionless communication is very unreliable.
>  >   (see my previous post for tfsend/tfrecv)
>  >   Only 100's of messages can be exchanged before getting
>  >   a sequence error.
> 
> At least this should make it easier to reproduce and track down the
> problem you identified in your previous mail...

   Yes. I've turned on the tipc logging and I'm coming up to
speed on tipc (and kernel) internals.
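
To pin down where the sequence error first shows up, a receiver-side check along these lines might help. This is only a sketch in Python, not the actual tfrecv code: `first_sequence_error` is a hypothetical helper, and it assumes every test message carries a sequence number that increases by one.

```python
def first_sequence_error(seqnos, start=0):
    """Return (index, expected, got) for the first out-of-sequence
    message, or None if the whole stream is in order.

    Assumption: sequence numbers start at `start` and increase by 1.
    """
    expected = start
    for index, seqno in enumerate(seqnos):
        if seqno != expected:
            return index, expected, seqno
        expected += 1
    return None
```

For received sequence numbers 0, 1, 2, 4, 5 it reports the gap at index 3 (expected 3, got 4), and it returns None when nothing is missing or reordered.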

>  >
>  > 2. Resetting 1 node causes confusion of the tipc name table
>  > of other nodes.
>  >
>  > If I have 3 UML's (A,B,C),
>  >  - publish a tipcname on A,
>  >  - reset node C
>  >  - then node A gets stuck periodically withdrawing the published name.

   I should have said:
     node A continually sends out periodic withdraw messages.

I repeated this test and saw different behaviour...

After node C was reset, things look pretty normal: the link to C gets
torn down. But name publication seems to be broken. The name distribution
seems to work at the packet level, but node B always reports
that the name of interest has been published, even if I kill
the process that owns the name. I waited for quite a while
for the timeout... This sounds like it could be the fixed bug
that you mention below. I'll see if that helps.

BTW, about a month ago, I extended the link tolerance
from 1500 (ms ?) to 15000. This helped avoid false
link down messages at a time when I didn't care about detecting
node failures. Now I do care about node failure and
I've added tolerance and maxinterval as insmod options.
I'm testing with tolerance=1600, maxinterval=400.
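
To show why the tolerance value matters here, a rough sketch in Python. It assumes a link is declared down once nothing has been heard from the peer for longer than the tolerance; that is my reading of the setting, not something taken from the tipc source. `false_link_downs` and the gap values are made up for illustration, loosely based on the normal 0.3 second latency and the > 5 second stalls from item 3.

```python
def false_link_downs(silences_ms, tolerance_ms):
    """Return the silent periods (ms) that exceed the link tolerance.

    Assumption: any silence longer than `tolerance_ms` makes the link
    look dead, even if the peer is alive and merely delayed.
    """
    return [gap for gap in silences_ms if gap > tolerance_ms]


# Hypothetical gaps between packets seen from one peer, in ms:
# mostly the normal ~0.3 s latency plus one stall just over 5 s.
observed = [300, 300, 5200, 300]
```

With these numbers, a tolerance of 1500 or 1600 ms flags the 5200 ms stall as a link failure, while 15000 ms does not. If the long stalls persist, tolerance=1600 may well bring the false link down messages back.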


> 
> Do you mean node A hangs ? (Btw, are you running the latest version
> tipc-1.3.14 ?,

No, I'm stuck in linux-2.4 land running tipc-1.2.05.
I'm going to either try umlinux-2.6 and the latest tipc or
backport tipc to (um)linux 2.4. I'll keep posting.

> I fixed a quite nasty bug in one of the later versions, where equal
> publications
> from different nodes got the same publication key, with the result that
> the wrong
> publication was removed sometimes)
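
To check my understanding of that bug, here is a toy model in Python. It is not TIPC's name table code: the list-of-tuples table, `withdraw`, and the `match_node` flag are all made up for illustration. It assumes two nodes publish the same name and end up with the same publication key.

```python
def withdraw(table, name, node, key, match_node):
    """Remove one publication of `name` and return the removed entry.

    table: dict mapping name -> list of (node, key) publications.
    match_node=False models a lookup by key alone, so it removes the
    first entry with a matching key, whichever node owns it.
    match_node=True also requires the owning node to match.
    """
    entries = table.get(name, [])
    for i, (pub_node, pub_key) in enumerate(entries):
        if pub_key == key and (not match_node or pub_node == node):
            del entries[i]
            return (pub_node, pub_key)
    return None


def example_table():
    # Nodes A and C publish the same name and collide on key 1.
    return {"svc": [("A", 1), ("C", 1)]}
```

When C's publication is withdrawn from `example_table()` with `match_node=False`, A's entry is the one removed and C's stale entry stays behind. With `match_node=True` the right entry goes away. That would fit a name still being reported as published after its owner is gone.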
> 
>  >
>  >
>  >
>  > 3. On a lightly loaded system with 10s of processes per UML,
>  >    there are frequently very long (> 5 seconds) tipc packet latencies
>  >    whereas the normal latency is 0.3 seconds. The desktop load
>  >    is always very low. If I ping each node every 0.1 seconds
>  >    the high latencies mostly go away. Seems like the signal
>  >    is missed...
> 
>  >
>  >
>  > So, does TIPC assume real-time behaviour
>  > of packets on the network. i.e. by the time N packets have been
>  > sent, the other kernels have received the packets and will send
>  > flow control messages?
> 
> To work really well we assume that there is real parallelism, but TIPC
> should never fail because of long latency times. 
> I think I have seen a similar effect on VMware.

And did you ever get a tipc network to behave properly
using vmware?

> 
>  >
>  >
>  > If this is the case then I think a minimal modification to UML
>  > to do a sched_yield() every N packets (in or out) would
>  > help matters. Furthermore, it may be a good idea to co-operatively
>  > schedule all the UML's so that their clocks are all in sync and
>  > no node can send or receive too many packets at one time.
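
The "yield every N packets" idea could look roughly like this. It is a Python sketch only; the real change would have to go into UML's C network code, which I have not written. `forward_packets`, `deliver`, and `yield_every` are hypothetical names, and the default of 16 is arbitrary.

```python
import os


def forward_packets(packets, deliver, yield_every=16):
    """Deliver packets in order, offering the CPU to other processes
    after every `yield_every` packets.

    Returns the number of yield points reached.
    """
    yields = 0
    for count, packet in enumerate(packets, start=1):
        deliver(packet)
        if count % yield_every == 0:
            # os.sched_yield() wraps sched_yield(2); it only exists on
            # platforms that provide that call (e.g. Linux).
            if hasattr(os, "sched_yield"):
                os.sched_yield()
            yields += 1
    return yields
```

Forwarding 40 packets with `yield_every=16` reaches two yield points, after packets 16 and 32, so the other UML instances get a chance to run and answer with flow control before the sender gets too far ahead.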
>  >
>  > Comments?
>  >
>  > // Randy
>  >
>  >
>  >
>  >
>  > cpbr.0    Link encap:Ethernet  HWaddr 00:FF:3A:1A:CB:69
>  >           inet addr:10.0.254.1  Bcast:10.255.255.255  Mask:255.255.0.0
>  >           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>  >           RX packets:52 errors:0 dropped:0 overruns:0 frame:0
>  >           TX packets:10 errors:0 dropped:0 overruns:0 carrier:0
>  >           collisions:0 txqueuelen:0
>  >           RX bytes:1624 (1.5 Kb)  TX bytes:704 (704.0 b)
>  >
>  > pce.0.0.0 Link encap:Ethernet  HWaddr 00:FF:E2:5D:50:B0
>  >           UP BROADCAST RUNNING PROMISC MULTICAST  MTU:1500  Metric:1
>  >           RX packets:3262 errors:0 dropped:0 overruns:0 frame:0
>  >           TX packets:2562 errors:0 dropped:149 overruns:0 carrier:0
>  >           collisions:0 txqueuelen:1024
>  >           RX bytes:452010 (441.4 Kb)  TX bytes:519063 (506.8 Kb)
>  >
>  > pce.0.1.0 Link encap:Ethernet  HWaddr 00:FF:3A:1A:CB:69
>  >           UP BROADCAST RUNNING PROMISC MULTICAST  MTU:1500  Metric:1
>  >           RX packets:6 errors:0 dropped:0 overruns:0 frame:0
>  >           TX packets:1333 errors:0 dropped:1079 overruns:0 carrier:0
>  >           collisions:0 txqueuelen:1024
>  >           RX bytes:228 (228.0 b)  TX bytes:302530 (295.4 Kb)
>  >
>  >
>  >
>  >
>  >
>  > -------------------------------------------------------
>  > SF.Net email is sponsored by Shop4tech.com-Lowest price on Blank Media
>  > 100pk Sonic DVD-R 4x for only $29 -100pk Sonic DVD+R for only $33
>  > Save 50% off Retail on Ink & Toner - Free Shipping and Free Gift.
>  > http://www.shop4tech.com/z/Inkjet_Cartridges/9_108_r285
>  > _______________________________________________
>  > tipc-discussion mailing list
>  > tipc-discussion@lists.sourceforge.net
>  > https://lists.sourceforge.net/lists/listinfo/tipc-discussion
> 
> 


-- 
// Randy MacLeod



Thread overview: 3+ messages
2004-08-18 18:35 [uml-devel] tipc in networked UML Randy Macleod
2004-08-18 19:24 ` [uml-devel] Re: [tipc-discussion] " Jon Maloy
2004-08-19 19:27   ` Randy Macleod [this message]
